WebKit/Chrome and “Search Engine” Provider

A search engine can be added to the browser in several ways. A page can add one through JavaScript, an extension can add one, and the user agent also learns about search engine providers on its own through some heuristics. This post covers the JavaScript route and the self-learning behavior. You can see the list of registered “search engines” at chrome://settings/searchEngines. The “Default search options” are included when the user agent is installed; the “Other search engines” are added through the other mechanisms.


We are going to look at three different mechanisms:
    1. Using JavaScript
    2. WebKit Self-learning – 1
    3. WebKit Self-learning – 2

1. Using JavaScript

    HTML File:
    <html>
        <head>
            <title>AddSearchProvider Example</title>
            <script type="text/javascript">
                // Ask the browser to fetch the OpenSearch description and offer it to the user.
                window.external.AddSearchProvider("http://hostname/evilsearch.xml");
            </script>
        </head>
        <body>
            <h1>AddSearchProvider Example</h1>
        </body>
    </html>

    evilsearch.xml:
    <?xml version="1.0" encoding="UTF-8"?>
    <OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
        <ShortName>Evil Search</ShortName>
        <Description>Use Evilsearch.com to search the Web.</Description>
        <Tags>example web</Tags>
        <Contact>admin@Evilsearch.com</Contact>
        <Url type="text/html" template="http://www.Evilsearch.com/?q={searchTerms}&amp;pw={startPage?}"/>
    </OpenSearchDescription>

The browser downloads http://hostname/evilsearch.xml, which is an OpenSearch description document, and asks the user whether to add it. The user can decline. One important point: this request does not appear in the "Debugging" window of the Chrome browser. In other words, the log window does not show every request the browser makes.
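
To make the role of the Url template in evilsearch.xml concrete, here is a minimal sketch of how such a template could be expanded into a real search URL. The function name expandTemplate and its params argument are assumptions made for illustration and are not browser APIs. The sketch follows the OpenSearch convention that a parameter written with a trailing "?" is optional and may be replaced by an empty string. Note that the &amp; in the XML attribute becomes a plain & once the document is parsed.

```javascript
// Hypothetical helper (not a browser API): expand an OpenSearch URL template.
// {name} is a required parameter, {name?} is an optional one.
function expandTemplate(template, params) {
    return template.replace(/\{([A-Za-z:]+)(\??)\}/g, function (match, name, optional) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Percent-encode the supplied value before placing it in the URL.
            return encodeURIComponent(String(params[name]));
        }
        if (optional === "?") {
            // Optional parameters that are not supplied become empty strings.
            return "";
        }
        throw new Error("Missing required template parameter: " + name);
    });
}

// Template from evilsearch.xml, after XML entity decoding.
const template = "http://www.Evilsearch.com/?q={searchTerms}&pw={startPage?}";
console.log(expandTemplate(template, { searchTerms: "web kit" }));
// prints "http://www.Evilsearch.com/?q=web%20kit&pw="
```

Every query the user types into the address bar for this provider ends up in a URL of this shape, which is why the user prompt matters.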


 

2. WebKit Self-learning – 1
WebKit-based browsers learn about the "search" feature of a web application whenever they see a form submission. Each time you submit a form, the browser applies some heuristics to decide whether the submission is a search query.

A ChangeLog entry in the Chrome source code describes the addition of this feature:

2011-06-03  Philippe Beauchamp  <philippe.beauchamp@gmail.com>
        Reviewed by Dimitri Glazkov.

        Add the feature "Add as search engine…" in a search text field context menu for chromium
        https://bugs.webkit.org/show_bug.cgi?id=47980
        * public/WebContextMenuData.h:
        * public/WebSearchableFormData.h:
        * src/ContextMenuClientImpl.cpp:
        (WebKit::ContextMenuClientImpl::getCustomMenuFromDefaultItems):
        * src/WebSearchableFormData.cpp:
        (WebKit::WebSearchableFormData::WebSearchableFormData):

 

The code in chromium/src/third_party/WebKit/Source/WebKit/chromium/src/WebSearchableFormData.cpp implements this detection technique:

WebSearchableFormData::WebSearchableFormData(const WebFormElement& form, const WebInputElement& selectedInputElement)
{
    RefPtr<HTMLFormElement> formElement = form.operator PassRefPtr<HTMLFormElement>();
    HTMLInputElement* inputElement = selectedInputElement.operator PassRefPtr<HTMLInputElement>().get();

    // Only consider forms that GET data.
    // Allow HTTPS only when an input element is provided.
    if (equalIgnoringCase(formElement->getAttribute(methodAttr), "post")
        || (!IsHTTPFormSubmit(formElement.get()) && !inputElement))
        return;

    Vector<char> encodedString;
    TextEncoding encoding;

    GetFormEncoding(formElement.get(), &encoding);
    if (!encoding.isValid()) {
        // Need a valid encoding to encode the form elements.
        // If the encoding isn't found webkit ends up replacing the params with
        // empty strings. So, we don't try to do anything here.
        return;
    }

    // Look for a suitable search text field in the form when a
    // selectedInputElement is not provided.
    if (!inputElement) {
        inputElement = findSuitableSearchInputElement(formElement.get());

        // Return if no suitable text element has been found.
        if (!inputElement)
            return;
    }

    HTMLFormControlElement* firstSubmitButton = GetButtonToActivate(formElement.get());
    if (firstSubmitButton) {
        // The form does not have an active submit button, make the first button
        // active. We need to do this, otherwise the URL will not contain the
        // name of the submit button.
        firstSubmitButton->setActivatedSubmit(true);
    }

    bool isValidSearchString = buildSearchString(formElement.get(), &encodedString, &encoding, inputElement);

    if (firstSubmitButton)
        firstSubmitButton->setActivatedSubmit(false);

    // Return if the search string is not valid.
    if (!isValidSearchString)
        return;

    String action(formElement->action());
    KURL url(formElement->document()->completeURL(action.isNull() ? "" : action));
    RefPtr<FormData> formData = FormData::create(encodedString);
    url.setQuery(formData->flattenToString());
    m_url = url;
    m_encoding = String(encoding.name());
}

// Look for a suitable search text field in a given HTMLFormElement
// Return nothing if one of those items are found:
//  - A text area field
//  - A file upload field
//  - A Password field
//  - More than one text field
HTMLInputElement* findSuitableSearchInputElement(const HTMLFormElement* form)
{
    HTMLInputElement* textElement = 0;
    // FIXME: Consider refactoring this code so that we don't call form->associatedElements() twice.
    for (Vector<FormAssociatedElement*>::const_iterator i(form->associatedElements().begin()); i != form->associatedElements().end(); ++i) {
        if (!(*i)->isFormControlElement())
            continue;

        HTMLFormControlElement* formElement = static_cast<HTMLFormControlElement*>(*i);

        if (formElement->disabled() || formElement->name().isNull())
            continue;

        if (!IsInDefaultState(formElement) || formElement->hasTagName(HTMLNames::textareaTag))
            return 0;

        if (formElement->hasTagName(HTMLNames::inputTag) && formElement->willValidate()) {
            const HTMLInputElement* input = static_cast<const HTMLInputElement*>(formElement);

            // Return nothing if a file upload field or a password field are found.
            if (input->isFileUpload() || input->isPasswordField())
                return 0;

            if (input->isTextField()) {
                if (textElement) {
                    // The auto-complete bar only knows how to fill in one value.
                    // This form has multiple fields; don't treat it as searchable.
                    return 0;
                }
                textElement = static_cast<HTMLInputElement*>(formElement);
            }
        }
    }
    return textElement;
}
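
The checks in findSuitableSearchInputElement can be restated as a short sketch over plain objects. This is not WebKit code: the control shape ({ tag, type, name, disabled, inDefaultState }) and the function name findSuitableSearchInput are assumptions made for illustration, the willValidate() check is left out, and only the types "text" and "search" are treated as text fields, which is a simplification of isTextField().

```javascript
// Hypothetical model of the heuristic; each control is a plain object, not a DOM node.
function findSuitableSearchInput(controls) {
    let textControl = null;
    for (const control of controls) {
        // Disabled controls and controls without a name are skipped.
        if (control.disabled || control.name == null) {
            continue;
        }
        // A control that is not in its default state, or any textarea, disqualifies the form.
        if (!control.inDefaultState || control.tag === "textarea") {
            return null;
        }
        if (control.tag !== "input") {
            continue;
        }
        // File upload and password fields disqualify the form.
        if (control.type === "file" || control.type === "password") {
            return null;
        }
        if (control.type === "text" || control.type === "search") {
            // More than one text field: the form is not treated as searchable.
            if (textControl) {
                return null;
            }
            textControl = control;
        }
    }
    return textControl;
}

// A form with one text field and one submit button is accepted.
const simpleForm = [
    { tag: "input", type: "text", name: "q", disabled: false, inDefaultState: true },
    { tag: "input", type: "submit", name: "commit", disabled: false, inDefaultState: true }
];
console.log(findSuitableSearchInput(simpleForm).name); // prints "q"
```

Adding a second text field or a password field to simpleForm makes the function return null, which mirrors the early returns in the C++ code.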

The following PoC passes this heuristic and makes the "Add as search engine" menu option appear.

<html>
<body>
<form action='/search' method='get'>
    <input autocomplete='off' id='q' name='q' type='text' value='Search'>
    <input name="commit" type="submit" value="Go" />
</form>
</body>
</html>


 

3. WebKit Self-learning – 2

WebKit-based browsers can also learn about a "search" feature and add it to the search engine list automatically, without user consent. Important: if the search form is placed on the main page of the site, it will be added to the list of "search providers". For example, when a user uses the search feature at http://hostname/ (with no filename in the URL), the user agent detects this, constructs the URL needed to perform the search, and adds it to the list of "search providers".

index.html:
<!DOCTYPE html>
<html>
    <body>
        <form action='/search.html' method='get'>
            <input id='q' name='q' type='text' value='afasdf'>
            <input name="commit" type="submit" value="Go" />
        </form>
    </body>
</html>

The request for /search.html can return any data, but it must at least return a valid HTTP response.

User Agent will add this “search engine” under “Other Search Engines”.
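
As a rough illustration of what "constructing the URL needed to do the search" could look like, the sketch below resolves the form action against the page URL and builds the query string, putting a placeholder where the user's text would go. The function name buildSearchTemplate, the field objects, and the use of an OpenSearch-style {searchTerms} placeholder are assumptions for illustration; this is not the browser's actual implementation.

```javascript
// Hypothetical sketch: derive a search URL template from a GET form.
function buildSearchTemplate(pageUrl, action, fields, searchFieldName) {
    // Resolve the (possibly relative) form action against the page URL.
    const url = new URL(action || "", pageUrl);
    const pairs = fields.map(function (field) {
        // The search field gets a placeholder; other fields keep their submitted value.
        const value = field.name === searchFieldName
            ? "{searchTerms}"
            : encodeURIComponent(field.value);
        return encodeURIComponent(field.name) + "=" + value;
    });
    return url.origin + url.pathname + "?" + pairs.join("&");
}

// Fields of the form in index.html above, served from http://hostname/.
const fields = [
    { name: "q", value: "afasdf" },
    { name: "commit", value: "Go" }
];
console.log(buildSearchTemplate("http://hostname/", "/search.html", fields, "q"));
// prints "http://hostname/search.html?q={searchTerms}&commit=Go"
```

The submit button's name and value are part of the template, which matches the comment in the C++ code about activating the first submit button so that its name ends up in the URL.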
