This page lists security considerations for LicenseChooser.js. By documenting how the design uses input and generates output, and by soliciting feedback on both, we hope to avoid correctness errors that could lead to vulnerabilities in our servers or in the web applications of those who use LicenseChooser.js.
Warning: This might be boring.
LicenseChooser.js creates only DOM objects and CSS styles whose names begin with cc_js_, so it cannot overwrite your objects or styles unless you also use the cc_js_ "namespace".
It also sends a reference to the desired localized template. This is based on the values of $_SERVER['SCRIPT_PATH'] and $_SERVER['HTTP_HOST']. The SCRIPT_PATH value is created by PHP and cannot be forged as far as I know. HTTP_HOST is taken from the client's Host: header, which a normal HTTP/1.1 user agent sets to the domain of the web site it is requesting the file from (in our case, it should always be api.creativecommons.org).
Spoofing the HTTP Host: header would cause Apache to dispatch the request to a different virtual host, so any request that reaches this code must, in effect, have carried the correct value.
On page load, cc_js_init() examines the DOM to look for elements you can set that control the actions of LicenseChooser.js. If there is a cc_js_seed_uri attribute, it calls cc_js_license_url_to_attributes(). All data in this URL is either validated before it controls code flow or the input will be so malformed that execution will stop abruptly.
First, that function verifies that the beginning of the URL indicates that it is a CC license. If the URL is malformed, the split("/") operation could fail, or access to elements of the resulting parts array could fail. If they succeed, cc_js_set_version() is called, which sets the version in the license_array. That version is validated later by consulting the jurisdiction_array. cc_js_set_attribs ignores any text it does not recognize, and the chosen attributes are likewise validated against the jurisdiction_array when cc_js_rest_of_modify() is called. The jurisdiction is handled just like the version.
This is the only part of the code flow that depends on values in the DOM that are intended to be controlled. If you modify cc_js_* attributes in other ways, you could perhaps make the code do unexpected things, but such access also includes the ability to override the cc_js_* functions, so we have no way to defend against that.
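The checks described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual cc_js_license_url_to_attributes() code: the function name parseLicenseUrl, the prefix constant, and the list of known attribute codes are all hypothetical, and the later validation against the jurisdiction_array is left out.

```javascript
// Hypothetical sketch; names and constants here are assumptions, not the real widget code.
const LICENSE_PREFIX = "http://creativecommons.org/licenses/";
const KNOWN_ATTRIBS = ["by", "nc", "nd", "sa"];

function parseLicenseUrl(url) {
  // Step 1: refuse anything that does not start like a CC license URL.
  if (typeof url !== "string" || !url.startsWith(LICENSE_PREFIX)) {
    return null;
  }
  // Step 2: split the remainder on "/" and drop empty segments (e.g. a trailing slash).
  const parts = url
    .slice(LICENSE_PREFIX.length)
    .split("/")
    .filter((part) => part !== "");
  // Step 3: a usable URL needs at least an attribute segment and a version segment.
  if (parts.length < 2) {
    return null;
  }
  // Step 4: silently ignore attribute codes we do not recognize.
  const attribs = parts[0].split("-").filter((a) => KNOWN_ATTRIBS.includes(a));
  return {
    attribs: attribs,
    version: parts[1],
    jurisdiction: parts.length > 2 ? parts[2] : null,
  };
}

console.log(parseLicenseUrl("http://creativecommons.org/licenses/by-nc-sa/3.0/us/"));
console.log(parseLicenseUrl("http://example.org/licenses/by/3.0/"));
```

In this sketch a malformed URL yields null instead of an exception; either way, untrusted text never reaches the code that picks a license without first passing the prefix and shape checks.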
The complete.js file adds a <script> tag to the document it is included into, which causes the browser to request a static file, template.js (perhaps localized). This file document.write()s HTML to the page, but the validity of that HTML as XML has already been checked by the Python program that generated template.js.
The cc-jurisdictions.js file is generated nightly by Python. Only one section of the file is updated, and that section is validated to be a single JSON object, so the generation process cannot modify more than one variable. The variable it does modify is decoded from JSON and checked for equality with the Python object that generated it before the file is modified. That way, any invalid input during the generation of the jurisdiction array is caught before it is saved into a file that is served to web clients.
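The round-trip check described above is done in Python; the sketch below only transliterates the idea into JavaScript so that it can sit alongside the other examples on this page. The names serializeChecked and deepEqual are hypothetical, and the real generator may differ in every detail.

```javascript
// Hypothetical sketch of "encode, decode, compare" before writing a file.
function deepEqual(a, b) {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (k) => Object.prototype.hasOwnProperty.call(b, k) && deepEqual(a[k], b[k])
  );
}

function serializeChecked(value) {
  // Only a single JSON object is allowed, never an array or a bare scalar.
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("expected a single object");
  }
  const text = JSON.stringify(value);
  // Decode what we just encoded and insist that it equals the source data.
  if (!deepEqual(JSON.parse(text), value)) {
    throw new Error("round trip changed the data; refusing to write");
  }
  return text;
}

console.log(serializeChecked({ us: { versions: ["3.0"] } }));
```

A value that JSON cannot represent faithfully (NaN, for instance, is encoded as null) fails the comparison, so the caller never gets text to write.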
Client interaction and network activity
All LicenseChooser.js DOM calls start with references to the cc_js_$() function, which ensures that the code only modifies DOM elements prefixed by cc_js_. By verifying this, you can see that we do not have access to user data other than the forms created by us. Relatedly, all CSS styles are prefixed with cc_js_, and the class or ID values of all created DOM objects, if present, are prefixed with cc_js_ as well. By verifying this, you can ensure that we do not create conflicts with your styles or elements.
The LicenseChooser.js JS files perform no network activity once they have been downloaded to the client. By verifying this, you can see that we do not send any information (personally identifiable or otherwise) to any CC.org servers except as part of downloading the widget.
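One way a lookup helper can enforce the prefix is to add it unconditionally, so that callers cannot name an element outside the namespace. The sketch below is an assumption about the approach, not the real cc_js_$() source: makePrefixedLookup and the fake document object are hypothetical, and the document is passed in as a parameter only so the example runs outside a browser.

```javascript
// Hypothetical sketch of a prefix-enforcing lookup in the spirit of cc_js_$().
const PREFIX = "cc_js_";

function makePrefixedLookup(doc) {
  // The prefix is always prepended, so an id such as "login_form"
  // resolves to "cc_js_login_form" and never to the host page's own element.
  return function lookup(id) {
    return doc.getElementById(PREFIX + id);
  };
}

// Minimal stand-in for a browser document, for illustration only.
const fakeDocument = {
  elements: {
    cc_js_result: { id: "cc_js_result" },
    login_form: { id: "login_form" },
  },
  getElementById(id) {
    return Object.prototype.hasOwnProperty.call(this.elements, id)
      ? this.elements[id]
      : null;
  },
};

const lookup = makePrefixedLookup(fakeDocument);
console.log(lookup("result"));
console.log(lookup("login_form"));
```

In a browser, the same helper would be created once with the real document object.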
complete.js takes an input variable of ?locale= in the query string. It matches this against the regex /^([a-zA-Z-_]+)$/ to ensure that the value is not empty and contains only ASCII letters, hyphens, and underscores (digits are not matched). If the value passes, it asks the client to request a file with these characters suffixed onto a URL. The permitted characters do not need to be escaped, so no escaping is done.
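The behaviour of that pattern can be checked directly. In the sketch below the regex is copied from the text above, while the helper name isSafeLocale and the sample inputs are hypothetical.

```javascript
// The pattern quoted above, copied verbatim.
const LOCALE_PATTERN = /^([a-zA-Z-_]+)$/;

// Hypothetical helper: true only for a non-empty string of letters, "-" and "_".
function isSafeLocale(value) {
  return typeof value === "string" && LOCALE_PATTERN.test(value);
}

const samples = ["en_US", "pt-BR", "", "en1", "../secret", "en.js", "en US"];
for (const sample of samples) {
  console.log(JSON.stringify(sample), isSafeLocale(sample));
}
```

Because the pattern is anchored at both ends and excludes ".", "/", and whitespace, a value that passes cannot change the directory or the host of the URL it is appended to.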