|
|
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head> <title>Microsoft Index Server Guide: Internet Data Query Files</title> <meta name="FORMATTER" content="Microsoft FrontPage 1.1"> <meta name="GENERATOR" content="Microsoft FrontPage 1.1"> </head>
<body bgcolor="#FFFFFF"> <!--Headerbegin--><p align=center><a name="TOP"><img src="onepix.gif" alt="Space" align=middle width=1 height=1></a> <a href="default.htm#Top"><img src="toc.gif" alt=" Contents" align=middle border=0 width=89 height=31></a> <a href="webhits.htm"><img src="previous.gif" alt="Previous" align=middle border=0 width=32 height=31></a> <a href="htxhelp.htm"><img src="next.gif" alt="Next" align=middle border=0 width=32 height=31></a> </p> <hr> <!--Headerend--><p><a name="InternetDataQueryFiles"><font size=6><strong>Internet Data Query Files</strong></font></a></p> <p align=left><!--Chaptoc--></p> <blockquote> <p><a href="idqhelp.htm#namesection">Names Section</a> <br> <a href="idqhelp.htm#qrysection">Query Section</a> <br> <a href="idqhelp.htm#SeqQuery">Effect of Parameters on Query Performance</a> <br> </p> </blockquote> <hr> <!--ChaptocEnd--><p>Internet Data Query files (files with an .idq extension) for Microsoft Index Server (together with the form parameters) specify the query that Microsoft Index Server will run. The .idq file is divided into two sections, the <a href="#namesection">names section</a> and the <a href="#qrysection">query section</a>. The names section is optional, and need not be supplied for standard queries. </p> <p><a name="IDQ-HTXPath"><strong>Note</strong></a><strong>   </strong>All paths to .idq files must be the full path name from a virtual root, not a relative path or a physical path. In other words, all paths must start with a slash and cannot contain “.” or “..” components. See the following examples:</p> <blockquote> <p><strong>Valid Paths</strong><br> /scripts/myquery.idq<br> /scripts/samples/search/query.idq</p> </blockquote> <blockquote> <p><strong>Invalid Paths</strong><br> c:\inetsrv\scripts\myquery.idq<br> scripts/query.idq <br> /samples/../scripts/query.idq <br> </p> </blockquote> <p>The .idq files cannot be on a virtual root pointing to a remote Uniform Naming Convention (UNC) share.</p> <hr> <h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="namesection">Names Section</a></h1> <p>The names section of the Internet Data Query file defines nonstandard column names that can be referred to in the query. The columns refer to ActiveX™ properties that have been created in document files with IPropertyStorage, or in the Microsoft® Office summary and custom properties. The globally unique identifier (GUID) for Microsoft Office is 0xF29F85E0,0x4FF9,0x1068,0xAB9108002B27B3D9. The following sample defines a few of the ActiveX Summary Information properties: </p> <blockquote> <pre><font size=3><tt>[Names] #Property set for OLE document properties DocTitle = F29F85E0-4FF9-1068-AB91-08002B27B3D9 2 DocSubject( DBTYPE_STR|DBTYPE_BYREF ) = F29F85E0-4FF9-1068-AB91-08002B27B3D9 3 DocAuthor( DBTYPE_STR|DBTYPE_BYREF ) = F29F85E0-4FF9-1068-AB91-08002B27B3D9 4 DocEditTime( DBTYPE_DATE ) = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xa DocLastPrinted( DBTYPE_DATE ) = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xb DocPageCount( DBTYPE_I4 ) = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xe DocWordCount( DBTYPE_I4 ) = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xf SalesRegion( DBTYPE_WSTR | DBTYPE_BYREF ) = D5CDD505-2E9C-101B-9397-08002B2CF9AE "SalesRegion"</tt></font></pre> </blockquote> <p>Within the section, any blank line, or a line beginning with a number sign (<tt>#</tt>) is ignored. Other lines consist of a <em>friendly name</em>, optionally followed by a <em>datatype</em> in parenthesis, followed by an equal sign (=), then a <a href="glossary.htm#GUID">GUID</a> identifying the <em>property set</em> for the column, followed by either a number or a string giving the <a href="glossary.htm#PROPID">PROPID</a> or the <em>property name,</em> respectively. If no datatype is provided, DBTYPE_WSTR is assumed.</p> <p>The friendly name is the token in query restrictions, sort specifications, and so on. Multiple friendly names can point to the same property. For example, the friendly name “Author” might be replaced by “Auteur” if an author property is to be shown to a French audience. Friendly names cannot contain spaces or special characters such as angle brackets, equal signs, exclamation points, commas, periods, and asterisks (>=<!,.*).</p> <p>The GUID and PROPID/property name is the name of the property within the ActiveX property namespace. See the Win32 Software Development Kit (SDK) for more information on ActiveX properties. The PROPID may be specified as a decimal (base 10) or in hexadecimal (base 16) number. In the latter case, the number must be preceded by 0x. Property names must be enclosed in quotation marks. For example, “10” is not the same as 10.</p> <p>The datatype is used during restriction parsing to correctly interpret user input. The following table lists the datatypes supported, their equivalent ActiveX mnemonics, and any formatting restrictions.</p> <div align=left> <table border=1 cellpadding=5 cellspacing=0 width=100%> <tr><th align=left valign=bottom width=15%><font size=2>Datatype</font></th><th align=left valign=bottom width=15%><font size=2>ActiveX mnemonics</font></th><th align=left valign=bottom width=70%><font size=2>Formatting restrictions</font></th></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_I1</font></td><td valign=top width=15%><font size=2>VT_I1</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number, for example, 0x3F8.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_UI1</font></td><td valign=top width=15%><font size=2>VT_UI1</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_I2</font></td><td valign=top width=15%><font size=2>VT_I2</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_UI2</font></td><td valign=top width=15%><font size=2>VT_UI2</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_I4</font></td><td valign=top width=15%><font size=2>VT_I4</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_UI4</font></td><td valign=top width=15%><font size=2>VT_UI4</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_I8</font></td><td valign=top width=15%><font size=2>VT_I8</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_UI8</font></td><td valign=top width=15%><font size=2>VT_UI8</font></td><td valign=top width=70%><font size=2>Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_R4</font></td><td valign=top width=15%><font size=2>VT_R4</font></td><td valign=top width=70%><font size=2>Real number. Can be expressed in scientific notation.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_R8</font></td><td valign=top width=15%><font size=2>VT_R8</font></td><td valign=top width=70%><font size=2>Real number. Can be expressed in scientific notation.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_CY</font></td><td valign=top width=15%><font size=2>VT_CY</font></td><td valign=top width=70%><font size=2>Currency. Expressed as two integers, separated by a period, for example, 100.55. Cannot be preceded by $, ¥, £, and so on. This datatype does not specify the currency format.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_DATE</font></td><td valign=top width=15%><font size=2>VT_DATE</font></td><td valign=top width=70%><font size=2>Date. Expressed as an absolute in two forms: <em>yyyy/mm/dd</em> and <em>yyyy/mm/dd hh:mm:ss</em>. Also expressed as a relative date: -#y, -#m, -#w, -#d, -#h, -#n, -#s where the letters correspond to year, month, week, day, hour, minute and second, respectively. Positive relative dates into the future are not supported.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_BOOL</font></td><td valign=top width=15%><font size=2>VT_BOOL</font></td><td valign=top width=70%><font size=2>Boolean. Expressed as TRUE or FALSE.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_STR</font></td><td valign=top width=15%><font size=2>VT_LPSTR</font></td><td valign=top width=70%><font size=2>String. Any input accepted.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_WSTR</font></td><td valign=top width=15%><font size=2>VT_LPWSTR</font></td><td valign=top width=70%><font size=2>Unicode string. Any input accepted.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_BSTR</font></td><td valign=top width=15%><font size=2>VT_BSTR</font></td><td valign=top width=70%><font size=2>Basic string. Any input accepted.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_GUID</font></td><td valign=top width=15%><font size=2>VT_CLSID</font></td><td valign=top width=70%><font size=2>GUID (Globally Unique IDentifier). Expressed as <em>xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.</em></font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_BYREF</font></td><td valign=top width=15%><font size=2>(not applicable)</font></td><td valign=top width=70%><font size=2>Older operator. Should be added to strings. For example: DBTYPE_WSTR | DBTYPE_BYREF.</font></td></tr> <tr><td valign=top width=15%><font size=2>DBTYPE_VECTOR</font></td><td valign=top width=15%><font size=2>VT_VECTOR</font></td><td valign=top width=70%><font size=2>Older operator. Vector properties are fully supported.</font></td></tr> <tr><td><font size=2>VT_FILETIME</font></td><td><font size=2>VT_FILETIME</font></td><td><font size=2>Expressed as an absolute in two forms: <em>yyyy/mm/dd</em> and <em>yyyy/mm/dd hh:mm:ss</em>. Also expressed as a relative date: -#y, -#m, -#w, -#d, -#h, -#n, -#s where the letters correspond to year, month, week, day, hour, minute and second, respectively. Positive relative dates into the future are not supported.</font></td></tr> </table> </div> <p>The friendly names are always available, even if they are not explicitly defined in the names section. See <a href="qrylang.htm#PropertyNamesList">List of Property Names</a> on the “Query Language” page. For other Microsoft Office properties, see the Microsoft Office Software Developer’s Kit (SDK). For properties available with other products, see the documentation for each independent software vendor.</p> <p>The <a href="filtrhlp.htm">HTML filter</a> extracts text from the content field of a meta element. For example, if an HTML file has this line:</p> <blockquote> <pre><META NAME="DESCRIPTION" CONTENT="Sample query form for Microsoft Index Server"></pre> </blockquote> <p>Then a user can query the information in the content field, namely “Sample query form for Microsoft Index Server”, by using the HTML meta property. The <a href="glossary.htm#GUID">GUID</a> for the meta property is D1B5D3F0-C0B3-11CF-9A92-00A0C908DBF1 and the property name is specified by the name field, or the HTTP-EQUIV field. In the above example, the property name is <tt>DESCRIPTION</tt>. Thus a friendly name, say MetaDescription, for the meta property can be defined as </p> <blockquote> <pre>MetaDescription(DBTYPE_WSTR) = D1B5D3F0-C0B3-11CF-9A92-00A0C908DBF1 description</pre> </blockquote> <p>The GUID for meta property is a registry parameter located at </p> <blockquote> <pre>HKEY_LOCAL_MACHINE  \System   \CurrentControlSet    \Control\HtmlFilter     \MetaTagClsid</pre> </blockquote> <hr> <h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="qrysection">Query Section</a> </h1> <p>The query section of the .idq file specifies parameters that will be used in the query. It can refer to form variables and can include conditional expressions to set a variable to alternative values depending upon some condition. The section begins with a [Query] tag, and is followed by a set of parameters. Here is a simple .idq file:</p> <blockquote> <pre>[Query] CiScope=/ CiColumns=FileName CiRestriction=#filename *.* CiTemplate=/Scripts/Template.htx</pre> </blockquote> <p>The preceding four parameters are required. In many cases, one or more parameters will be passed down from a form. Here is a very simple form:</p> <blockquote> <pre><FORM ACTION="/scripts/simple.idq" METHOD="GET"> Query : <INPUT TYPE="TEXT" NAME="Restriction" SIZE="60" MAXLENGTH="100" VALUE=""> <INPUT TYPE="SUBMIT" VALUE="Execute Query"> </FORM></pre> </blockquote> <p>This form can work with the following .idq file to pass parameters through from the user:</p> <blockquote> <pre>[Query] CiScope=/ CiColumns=FileName CiRestriction=%Restriction% CiTemplate=/Scripts/Template.htx</pre> </blockquote> <p><a href="htxhelp.htm#ifstatement">Conditional expressions</a> can also be used in .idq files in exactly the same manner as .htx files. In addition to the four parameters shown earlier, there are many other optional parameters. Common additions include <a href="idq-vars.htm#CiSort"><strong>CiSort</strong></a> and <a href="idq-vars.htm#CiForceUseCi"><strong>CiForceUseCi</strong></a>. See the <a href="idq-vars.htm#idqvars">full list</a> of additions.</p> <hr> <p><font color="#FF0000"><strong>Warning</strong></font><strong>   </strong>Be careful when substituting parameters for the <a href="idq-vars.htm#CiTemplate"><strong>CiTemplate</strong></a> parameter because you could unintentionally allow files in execute-only scripts directories to be sent over the network. For example, if an .idq file contained the line </p> <blockquote> <pre>CiTemplate=%CiTemplate%</pre> </blockquote> <p>a client could send a URL that contained the following line in the query string:</p> <blockquote> <pre>CiTemplate=/scripts/mysecretfile.pl</pre> </blockquote> <p>With this string, an unauthorized user could read the contents of a confidential file.</p> <p>It is better to switch among different. htx files by just using the base name of the file and adding the script directory and file name extension in the parameter substitution. The following file, Sample.idq, shows how to do this:</p> <blockquote> <pre>[Query]</pre> <blockquote> <pre>CiRestriction=%q% CiTemplate=/scripts/%t%.htx CiSort=%s% CiScope=/</pre> </blockquote> </blockquote> <p>The query can be executed with a URL like the following:</p> <blockquote> <pre>http://<em>computername</em>/scripts/sample.idq?q=ActiveX&t=form1</pre> </blockquote> <hr> <h1><a href="#TOP"><img src="up.gif" alt="To Top" align=middle border=0 width=14 height=11></a><a name="SeqQuery">Effect of Parameters on Query Performance</a></h1> <p>The fastest query is a <em>sequential</em> query that uses the <em>content index</em>. Certain parameter settings will force the query engine to use a less efficient method to resolve the query. To guarantee fast queries, set <a href="idq-vars.htm#CiSort"><strong>CiSort</strong></a> to nothing (or descending by rank) set <a href="idq-vars.htm#CiForceUseCi"><strong>CiForceUseCi</strong></a> to TRUE, and do not reference <a href="idq-vars.htm#CiMatchedRecordCount"><b>CiMatchedRecordCount</b></a>, <a href="idq-vars.htm#CiRecordsNextPage"><b>CiRecordsNextPage</b></a><b>,</b> or <a href="idq-vars.htm#CiTotalNumberPages"><b>CiTotalNumberPages</b></a> in the .htx template.</p> <p><strong>Note:</strong> A Uniform Resource Locator (URL) or a form-based query can send up to 4 kilobytes (K) of data. If a query larger than 4K is sent, the behavior is unpredictable. The query size includes all variables sent from the browser to the .idq file.</p> <h2>Sequential versus Nonsequential Execution</h2> <p>A query can be executed sequentially (results fetched as needed) or it can be executed nonsequentially (results cached on the server). A sequential query requires fewer server resources, but also has some limitations. Backwards scrolling (<a href="idq-vars.htm#CiBookmarkSkipCount"><strong>CiBookmarkSkipCount</strong></a> < 0) will re-execute the query and scroll forward to the specified position. Sequential queries cannot refer to the following variables: <a href="idq-vars.htm#CiMatchedRecordCount"><b>CiMatchedRecordCount</b></a>, <a href="idq-vars.htm#CiRecordsNextPage"><b>CiRecordsNextPage</b></a>, and <a href="idq-vars.htm#CiTotalNumberPages"><b>CiTotalNumberPages</b></a><strong>.</strong></p> <p>Either of the following actions will force a query to be nonsequential:</p> <ul> <li><p align=left>Referring to the <a href="idq-vars.htm#CiMatchedRecordCount"><b>CiMatchedRecordCount</b></a>, <a href="idq-vars.htm#CiRecordsNextPage"><b>CiRecordsNextPage</b></a> or <a href="idq-vars.htm#CiTotalNumberPages"><b>CiTotalNumberPages</b></a> variables in the .htx page.</p> </li> <li><p align=left>Setting the <a href="idq-vars.htm#CiSort"><strong>CiSort</strong></a> parameter to a sort other than the native order of the result. <strong>CiSort</strong> can safely be set to sort ascending on WorkId (<tt>CiSort=WorkId[a]</tt>) or descending on Rank (<tt>CiSort=Rank[d]</tt>).</p> </li> </ul> <h2>Enumerated versus Indexed Resolution</h2> <p>Executing queries that must be <em>enumerated</em> can also slow down performance. Most queries are resolved by using the content index, but certain conditions force the query engine to recursively search the disk to locate matching files. These queries include:</p> <ul> <li><p align=left>Regular expressions on properties other than FileName which begin with a wildcard.</p> </li> <li><p align=left><a href="qrylang.htm#PropertyValueQueries">Property value queries</a> when <a href="idq-vars.htm#CiForceUseCi"><strong>CiForceUseCi</strong></a> is set to FALSE and the index is not up-to-date.</p> </li> <li><p align=left><a href="qrylang.htm#PropertyValueQueries">Property value queries</a> involving regular expressions with a wildcard prefix on a property other than <strong>FileName</strong> (for example, <tt>#DocAuthor *son</tt>).</p> </li> <li><p align=left><a href="qrylang.htm#PropertyValueQueries">Property value queries</a> involving regular expressions that start and end with wildcards (for example, <tt>#filename *sample*</tt>).</p> </li> <li><p align=left>Certain property value queries involving <strong>OR</strong> (such as <tt>@write > -1d OR @create > -1d</tt>).</p> </li> </ul> <p>Queries can be forced to use the content index by setting <a href="idq-vars.htm#CiForceUseCi"><strong>CiForceUseCi</strong></a> to <strong>TRUE</strong> in the .idq file. The query engine will always use the content index, but query results may be out-of-date for recently modified files. If the content index was used for a query, and some files on disk have been modified more recently than their contents have been filtered, the built-in variable <a href="idq-vars.htm#CiOutOfDate"><b>CiOutOfDate</b></a> will be set to the value 1. In some cases, a query is simply too complex to be resolved solely through use of the content index. In these cases, the built-in variable <a href="idq-vars.htm#CiQueryIncomplete"><strong>CiQueryIncomplete</strong></a> will be set to 1. Content queries can always be out of date and can use the content index anytime.</p> <h2>Deferring Nonindexed Trimming</h2> <p>Special support has been put in Index Server to optimize content queries that are sorted descending by rank (CiSort = Rank[d]). For such queries, minimal information can be retrieved from the index, before additional property and security tests are performed. However, if the total number of results matching the query is greater than <a href="idq-vars.htm#CiMaxRecordsInResultSet"><strong>CiMaxRecordsInResultSet</strong></a> then additional testing must be performed during index retrieval to remove items from this set that fail additional property and security tests. This frees up space in the result set for items matching the full query. This processing uses up resources, and can be deferred by setting <a href="idq-vars.htm#CiDeferNonIndexedTrimming"><strong>CiDeferNonIndexedTrimming</strong></a> to <strong>TRUE</strong>. The query will then pick <a href="idq-vars.htm#CiMaxRecordsInResultSet"><strong>CiMaxRecordsInResultSet</strong></a> items first, and trim those. The end result may be a number of matching items less than <a href="idq-vars.htm#CiMaxRecordsInResultSet"><strong>CiMaxRecordsInResultSet</strong></a>. For queries with the scope set to the entire corpus, on a server with little or no security, you can consider setting <a href="idq-vars.htm#CiDeferNonIndexedTrimming"><strong>CiDeferNonIndexedTrimming</strong></a> to <strong>TRUE</strong> to improve performance.</p> <!--Footerbegin--><hr> <p align=center><a href="default.htm#Top"><img src="toc.gif" alt=" Contents" align=middle border=0 width=89 height=31></a> <a href="webhits.htm"><img src="previous.gif" alt="Previous" align=middle border=0 width=32 height=31></a> <a href="#TOP"><img src="up_end.gif" alt="To Top" align=middle border=0 width=32 height=31></a> <a href="htxhelp.htm"><img src="next.gif" alt="Next" align=middle border=0 width=32 height=31></a> </p> <hr> <p align=center><em>© 1996 by Microsoft Corporation. All rights reserved.<!--Footerend--></em></p> </body>
</html>
|