Tricky getting this to work on an existing site. Here goes...
The trouble comes when you have an existing site that already contains many PDFs. The indexer will NOT scan the documents that are already there. Here is the workaround.
In the central admin...
- Operations > Services On Server > Stop "Windows SharePoint Services Search". Click ok to delete the index.
- Go to your SQL db and delete the WSS search database (you may need to recycle the SharePoint app pool in IIS to release it from SharePoint).
- Go back to Operations > Services On Server > and start "Windows SharePoint Services Search". As soon as the "Operation In Progress" page is done, continue with the next steps.
- If you have already installed the PDF filter, skip ahead to the step about recreating the "38" key; otherwise do the following:
- Download and then install the Adobe PDF IFilter from the following Adobe Web site: (http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611)
- Add the following registry entry, and then set the registry entry value to pdf:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\GUID\Gather\Search\Extensions\ExtensionList\38
To do this, follow these steps:
- Click Start, click Run, type regedit, and then click OK.
- Locate and then click the following registry subkey:
- HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\GUID\Gather\Search\Extensions\ExtensionList
- On the Edit menu, point to New, and then click String Value.
- Type 38, and then press ENTER.
- Right-click the registry entry that you created, and then click Modify.
- In the Value data box, type pdf, and then click OK.
- Verify that the following two registry subkeys are present and that they contain the appropriate values.
Note: These registry subkeys and the values that they contain are created when you install the Adobe PDF IFilter on the server.
- HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf
- The above registry subkey must contain the following registry entry:
- Name: Default
Type: REG_MULTI_SZ
Data: {4C904448-74A9-11D0-AF6E-00C04FD8DC02}
- HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\Filters\.pdf
- The above registry subkey must contain the following registry entries:
- Name: Default
Type: REG_SZ
Data: (value not set)
- Name: Extension
Type: REG_SZ
Data: pdf
- Name: FileTypeBucket
Type: REG_DWORD
Data: 0x00000001 (1)
- Name: MimeTypes
Type: REG_SZ
Data: application/pdf
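The check above can be sketched in Python. This is only a minimal sketch and not part of the original instructions: `EXPECTED` restates the values listed above, and `find_mismatches` is a hypothetical helper that compares them with a dictionary of values you have already read from the registry (for example with Python's `winreg` module on the server). The `Default` entry under `Filters\.pdf` is left out because it is listed as "(value not set)".

```python
# Expected registry values, restated from the list above.
# Subkey names are relative to:
# HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup
EXPECTED = {
    r"ContentIndexCommon\Filters\Extension\.pdf": {
        # REG_MULTI_SZ, represented here as a list of strings
        "Default": ["{4C904448-74A9-11D0-AF6E-00C04FD8DC02}"],
    },
    r"Filters\.pdf": {
        "Extension": "pdf",              # REG_SZ
        "FileTypeBucket": 1,             # REG_DWORD
        "MimeTypes": "application/pdf",  # REG_SZ
    },
}


def find_mismatches(actual, expected=EXPECTED):
    """Return (subkey, name, expected, found) for every value that differs.

    `actual` maps each subkey to a dict of value name -> data, in the same
    shape as EXPECTED. A missing subkey or value is reported with found=None.
    """
    problems = []
    for subkey, values in expected.items():
        found_values = actual.get(subkey, {})
        for name, want in values.items():
            got = found_values.get(name)
            if got != want:
                problems.append((subkey, name, want, got))
    return problems
```

An empty result means the values you collected match what the IFilter installer is expected to have written.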
- You need to recreate the "38" key (it is deleted when the index is deleted; this is a WSS 3.0 search bug)...
- Add the following registry entry, and then set the registry entry value to pdf:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\GUID\Gather\Search\Extensions\ExtensionList\38
To do this, follow these steps:
- Click Start, click Run, type regedit, and then click OK.
- Locate and then click the following registry subkey:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\GUID\Gather\Search\Extensions\ExtensionList
- On the Edit menu, point to New, and then click String Value.
- Type 38, and then press ENTER.
- Right-click the registry entry that you created, and then click Modify.
- In the Value data box, type pdf, and then click OK.
- Run "net stop spsearch" and then "net start spsearch" from command line.
- Go to Application Management > Content Databases > select your content db and set your search server.
- You are done.
Summary:
The problem is that when you stop the service, the registry entries for the ExtensionList are dropped. Once the service is restarted, the list is reset to the default IFilters and your custom IFilters (PDF, etc.) are gone, so your site is not scanned for your custom IFilter file types.
If there is a way to consolidate all or most of these steps into a command line script please post. :)
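As a starting point, here is a rough Python sketch of how the last few steps (recreating the "38" entry and restarting the search service) might be scripted. It is an untested sketch under some assumptions: `build_commands` and `run_all` are made-up helper names, the `reg add` call is my stand-in for the manual regedit steps, and `GUID` is the same placeholder used in the registry path above, which you would have to replace with the value from your own server. It does not cover the Central Administration steps or deleting the search database.

```python
import subprocess

# Registry key from the steps above. {guid} is filled in by build_commands();
# the real value is server-specific and is not known here.
EXTENSION_LIST_KEY = (
    r"HKLM\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0"
    r"\Search\Applications\{guid}\Gather\Search\Extensions\ExtensionList"
)


def build_commands(guid, value_name="38", extension="pdf"):
    """Return the commands, in the order used above, as argument lists."""
    key = EXTENSION_LIST_KEY.format(guid=guid)
    return [
        # Recreate the string value 38 = pdf (replaces the regedit steps).
        ["reg", "add", key, "/v", value_name, "/t", "REG_SZ", "/d", extension, "/f"],
        # Restart the WSS search service.
        ["net", "stop", "spsearch"],
        ["net", "start", "spsearch"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(subprocess.list2cmdline(cmd))
        if not dry_run:
            # check=True stops the sequence if a command fails.
            subprocess.run(cmd, check=True)
```

Calling `run_all(build_commands("GUID"))` only prints the three command lines, so you can review them first; pass `dry_run=False` from an elevated prompt on the server to actually run them.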
Here is another option I found on the internet...
I have seen it mentioned that if you already have PDF files in a document library and then add the Adobe iFilter you have to re-add the documents to the library before they will be indexed. Fortunately this is not true. SQL Server has some system stored procedures to manage indexing.
Use SQL Query Analyzer to run the following commands (after installing your iFilter):
USE Name_of_your_WSS_content_db
EXEC sp_fulltext_catalog 'ix_STS_servername_xxxxxx', 'rebuild'
- You will find the correct string for 'ix_STS_servername_xxxxxx' by using SQL Server Enterprise Manager.
- Expand the WSS content database and click on Full-Text Catalogs.
No restart of any services was required for me to be able to search on PDF contents.