RkBlog

Hardware, programming and astronomy tutorials and reviews.

The power of Google CSE - how to make a Google-powered Python search engine :)

Google has a public service called Custom Search Engine (CSE). It's not very popular: if you want Google-powered search for your site, you usually use Google AJAX Search or other services. Google CSE is much more complex, and it's hard to see the "cool part" until you get the result you were after.

Google CSE lets you tune Google search results using keywords, sites and so on. If you want a search engine for Python frameworks, then Django Reinhardt or construction pylons aren't what you're looking for...

Here is an XML config file for a Google CSE hosted on your site:
<?xml version="1.0" encoding="UTF-8" ?>
<GoogleCustomizations>
    <CustomSearchEngine>
        <Title>Python-Search</Title>
        <Description>Python-Search</Description>
        <Context>
            <BackgroundLabels>
                <Label name="pylabel" mode="BOOST" weight="0.8" />
            </BackgroundLabels>
        </Context>
    </CustomSearchEngine>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=python+$q&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Annotations>
        <Annotation about="http://wiki.pylonshq.com/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://pylonshq.com/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://www.djangoproject.com/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://code.djangoproject.com/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://www.djangobook.com/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://docs.python.org/lib/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://docs.python.org/dev/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://www.riverbankcomputing.co.uk/static/Docs/PyQt4/html/*">
            <Label name="pylabel"/>
        </Annotation>
    </Annotations>
</GoogleCustomizations>
I've used a background label, "pylabel" (applied by default), that boosts results from the sites listed in Annotations. I've added the Pylons wiki and main page, the same for Django, plus the Python and PyQt4 documentation. I've also added autogenerated annotations built from a Google search for "python + SEARCH_STRING" (so if you search for "pylons", results found by a "python+pylons" search will be boosted). If I search for "decorators" I get slightly different results (left - CSE, right - plain Google):
[screenshot: cse1]
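The Include and Annotation entries in the config above are repetitive, so they could be generated rather than typed by hand. Here is a minimal sketch in Python, assuming the URL format copied from the config above; the helper names `include_href` and `build_config_tail` are made up for illustration and are not part of any Google tool.

```python
import xml.etree.ElementTree as ET

# URL pieces copied from the config above.
MAKECSE = "http://www.google.com/cse/tools/makecse?url="
SEARCH = "http://www.google.com/search?q="


def include_href(query, label, num=100):
    """Build the href for an <Include type="Annotations"> element.

    Spaces in the query become "+". A "$q" placeholder is left untouched,
    exactly as it appears in the hand-written config.
    """
    q = query.replace(" ", "+")
    return f"{MAKECSE}{SEARCH}{q}&btnG=Search&num={num}&label={label}"


def build_config_tail(includes, annotations):
    """Return the <Include> elements followed by one <Annotations> element.

    includes:    list of (query, label) pairs
    annotations: list of (url_pattern, label) pairs
    """
    elements = []
    for query, label in includes:
        attrs = {"type": "Annotations", "href": include_href(query, label)}
        elements.append(ET.Element("Include", attrs))

    block = ET.Element("Annotations")
    for pattern, label in annotations:
        annotation = ET.SubElement(block, "Annotation", {"about": pattern})
        ET.SubElement(annotation, "Label", {"name": label})
    elements.append(block)
    return elements


if __name__ == "__main__":
    tail = build_config_tail(
        includes=[("python $q", "pylabel")],
        annotations=[
            ("http://wiki.pylonshq.com/*", "pylabel"),
            ("http://www.djangoproject.com/*", "pylabel"),
        ],
    )
    for element in tail:
        if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
            ET.indent(element)
        # ElementTree escapes "&" in attribute values as "&amp;" for us.
        print(ET.tostring(element, encoding="unicode"))
```

The printed elements can be pasted inside `<GoogleCustomizations>` after the `<CustomSearchEngine>` block.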
The difference isn't extremely impressive, but we can not only boost good sites but also limit the results to annotated sites. For example:
<?xml version="1.0" encoding="UTF-8" ?>
<GoogleCustomizations>
    <CustomSearchEngine>
        <Title>Python-Search</Title>
        <Description>Python-Search</Description>
        <Context>
            <BackgroundLabels>
                <Label name="pylabel" mode="BOOST" weight="0.8" />
            </BackgroundLabels>
            <Synonyms>
                <SynonymEntry word="pylons">
                    <Synonym>pylons framework</Synonym>
                </SynonymEntry>
                <SynonymEntry word="pylons">
                    <Synonym>pylons python</Synonym>
                </SynonymEntry>
            </Synonyms>
        </Context>
        <Context refinementsTitle="Search for $q in categories:">
            <Facet>
                <FacetItem title="Python">
                    <Label name="pylabel" mode="BOOST" weight="0.8">
                        <IgnoreBackgroundLabels>true</IgnoreBackgroundLabels>
                    </Label>
                </FacetItem>
            </Facet>
            <Facet>
                <FacetItem title="Pylons">
                    <Label name="pylons" mode="FILTER">
                        <IgnoreBackgroundLabels>true</IgnoreBackgroundLabels>
                    </Label>
                </FacetItem>
            </Facet>
            <Facet>
                <FacetItem title="Django">
                    <Label name="django" mode="FILTER">
                        <IgnoreBackgroundLabels>true</IgnoreBackgroundLabels>
                    </Label>
                </FacetItem>
            </Facet>
            <Facet>
                <FacetItem title="Python Reference">
                    <Label name="pyreference" mode="FILTER">
                        <IgnoreBackgroundLabels>true</IgnoreBackgroundLabels>
                    </Label>
                </FacetItem>
            </Facet>
        </Context>
    </CustomSearchEngine>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=python+language&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=jython&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=ironpython&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=django+framework&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=pylons+framework&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=pyqt&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=pygtk&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=python+$q&amp;btnG=Search&amp;num=100&amp;label=pylabel"/>

    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=pylons+framework&amp;btnG=Search&amp;num=100&amp;label=pylons"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=pylons+python&amp;btnG=Search&amp;num=100&amp;label=pylons"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=pylons+python+$q&amp;btnG=Search&amp;num=100&amp;label=pylons"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=pylons+framework+$q&amp;btnG=Search&amp;num=100&amp;label=pylons"/>

    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=django+framework&amp;btnG=Search&amp;num=100&amp;label=django"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=django+python&amp;btnG=Search&amp;num=100&amp;label=django"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=django+framework+$q&amp;btnG=Search&amp;num=100&amp;label=django"/>
    <Include type="Annotations" href="http://www.google.com/cse/tools/makecse?url=http://www.google.com/search?q=django+python+$q&amp;btnG=Search&amp;num=100&amp;label=django"/>
    <Annotations>
        <Annotation about="http://docs.python.org/lib/*">
            <Label name="pylabel"/>
        </Annotation>
        <Annotation about="http://docs.python.org/dev/*">
            <Label name="pylabel"/>
        </Annotation>

        <Annotation about="http://wiki.pylonshq.com/*">
            <Label name="pylons"/>
        </Annotation>
        <Annotation about="http://pylonshq.com/*">
            <Label name="pylons"/>
        </Annotation>

        <Annotation about="http://docs.djangoproject.com/en/dev/*">
            <Label name="django"/>
        </Annotation>
        <Annotation about="http://code.djangoproject.com/wiki/*">
            <Label name="django"/>
        </Annotation>
        <Annotation about="http://www.djangobook.com/*">
            <Label name="django"/>
        </Annotation>

        <Annotation about="http://docs.python.org/lib/*">
            <Label name="pyreference"/>
        </Annotation>
        <Annotation about="http://docs.python.org/dev/*">
            <Label name="pyreference"/>
        </Annotation>
    </Annotations>
</GoogleCustomizations>
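I've added extra labels used "on demand" (pylons, django and pyreference) that return hits only from annotated sites (FILTER mode). For Django I had to remove the tickets from the annotations, as they spammed the results :) That's why the annotation now covers only code.djangoproject.com/wiki/* instead of code.djangoproject.com/*. Searching for "decorators" under the Pylons and Django labels gives more precise results: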
[screenshot: cse2]
[screenshot: cse3]
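The difference between the two label modes can be pictured with a toy model. This is only a sketch of the behaviour described above (BOOST promotes annotated sites, FILTER keeps nothing else); it is not how Google actually ranks results. The scores, the sample URLs, and the rule "multiply the score by 1 + weight" are all assumptions made up for illustration, as is the function name `apply_label`.

```python
from fnmatch import fnmatchcase


def labels_for(url, annotations):
    """Return the set of labels whose URL pattern matches the given URL."""
    return {label for pattern, label in annotations if fnmatchcase(url, pattern)}


def apply_label(results, annotations, label, mode, weight=0.8):
    """Apply one label to a list of (url, score) results.

    FILTER: keep only results annotated with the label.
    BOOST:  keep everything, but raise the score of annotated results.
            Scaling by (1 + weight) is an assumption, not Google's formula.
    """
    out = []
    for url, score in results:
        tagged = label in labels_for(url, annotations)
        if mode == "FILTER":
            if tagged:
                out.append((url, score))
        elif mode == "BOOST":
            out.append((url, score * (1 + weight) if tagged else score))
        else:
            raise ValueError(f"unknown mode: {mode}")
    # Highest score first, like a result page.
    return sorted(out, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    annotations = [
        ("http://docs.djangoproject.com/en/dev/*", "django"),
        ("http://pylonshq.com/*", "pylons"),
    ]
    # Hypothetical hits for "decorators" with made-up relevance scores.
    results = [
        ("http://example.com/decorators", 1.0),
        ("http://docs.djangoproject.com/en/dev/topics/", 0.7),
        ("http://pylonshq.com/docs/", 0.6),
    ]
    for mode in ("BOOST", "FILTER"):
        print(mode)
        for url, score in apply_label(results, annotations, "django", mode):
            print(f"  {score:.2f}  {url}")
```

In this model the Django page moves above the unannotated page under BOOST, while under FILTER it is the only result left, which matches the narrower result lists seen under the Pylons and Django labels.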

21 September 2008;
