BEGIN HEADER
source:http://shuetech.com/minetheweb/demo/docs/mocksites/quotes/index.htm
BEGIN INFORMATION
BEGIN ACTION
startafter:
bgcolor="white" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody>|
endat:
<!-- Others -->|
pattern:
<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5E$1">$2</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$3</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$4</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$5</td>
</tr>
|
definition:
$1:SYMBOL:TEXT
$2:INDEX:TEXT
$3:CLOSING:TEXT
$4:CHANGE:TEXT:StripHTMLTags()
$5:CHANGEPCT:TEXT:StripHTMLTags()
BEGIN DO
For example, given this page:
<HTML>
<BODY>
Data line 1
Data line 2
Data line 3
</BODY>
</HTML>
the following keeps only the data lines between <BODY> and </BODY>:
startafter:
<BODY>|
endat:
</BODY>|
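The startafter/endat example above can be sketched in Python. This is only an illustration of the idea, not Mine The Web's actual implementation: the helper name slice_between is hypothetical, and it assumes the trailing | in the script merely terminates the string and is not part of it.

```python
def slice_between(html: str, start_after: str, end_at: str) -> str:
    """Return the text after the first start_after and before the next end_at."""
    start = html.find(start_after)
    if start == -1:
        raise ValueError("startafter string not found")
    start += len(start_after)          # skip past the marker itself
    end = html.find(end_at, start)     # search only after the start marker
    if end == -1:
        raise ValueError("endat string not found")
    return html[start:end]


# The toy page from the example above.
page = "<HTML>\n<BODY>\nData line 1\nData line 2\nData line 3\n</BODY>\n</HTML>"

section = slice_between(page, "<BODY>", "</BODY>")
print(section.strip())
```

Everything outside the two markers is dropped, so later steps only ever see the three data lines.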
In the quotes page, the section we want starts right after
bgcolor="white" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody>
and ends right before
<!-- Others -->
To tell Mine The Web to look only at the HTML between those two strings and ignore everything else, we write:
startafter:
bgcolor="white" border="0" cellpadding="0" cellspacing="0" width="100%"><tbody>|
endat:
<!-- Others -->|
<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5EDJI">Dow</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">10,402.77</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">+172.82</span></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">(+1.69%)</span></td>
</tr>
<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5EIXIC">Nasdaq</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">2,089.88</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">+26.07</span></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">(+1.26%)</span></td>
</tr>
<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5EGSPC">S&P 500</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">1,198.41</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">+19.51</span></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">(+1.65%)</span></td>
</tr>
Notice that each <tr> holds one market quote. We want to extract the useful information within each <tr>...</tr> and discard what we don't want, such as HTML tags. The scripting language gives you a flexible way to tell Mine The Web what to keep and what to drop: in the pattern below, each $n marks a value to keep, and the surrounding text is matched literally.
pattern:
<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5E$1">$2</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$3</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$4</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$5</td>
</tr>
|
definition:
$1:SYMBOL:TEXT
$2:INDEX:TEXT
$3:CLOSING:FLOAT
$4:CHANGE:FLOAT:StripHTMLTags()
$5:CHANGEPCT:FLOAT:StripHTMLTags()
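As a rough sketch of how a pattern and its definitions could work together, the Python below turns each $n placeholder into a capture group, matches every <tr> in the section, and strips tags from the fields marked with StripHTMLTags(). The names pattern_to_regex, strip_html_tags and FIELDS are hypothetical, and the matching rules (exact literal text, shortest possible capture) are assumptions rather than Mine The Web's documented behaviour. All fields are kept as text, as in the TEXT definitions of the full script; how Mine The Web converts FLOAT values is not shown here, so no numeric conversion is attempted.

```python
import re

PATTERN = '''<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5E$1">$2</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$3</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$4</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">$5</td>
</tr>'''

# Placeholder number -> (field name, strip HTML tags?), mirroring the definition block.
FIELDS = {
    "1": ("SYMBOL", False),
    "2": ("INDEX", False),
    "3": ("CLOSING", False),
    "4": ("CHANGE", True),
    "5": ("CHANGEPCT", True),
}

# Two rows of the sample section shown earlier.
SECTION = '''<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5EDJI">Dow</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">10,402.77</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">+172.82</span></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">(+1.69%)</span></td>
</tr>
<tr align="right">
<td class="yfnc_mktsumtxt" colspan="1" align="left" nowrap="nowrap"><a href="http://finance.yahoo.com/q?s=%5EIXIC">Nasdaq</a></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap">2,089.88</td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">+26.07</span></td>
<td class="yfnc_mktsumtxt" nowrap="nowrap"><span class="pos">(+1.26%)</span></td>
</tr>
'''


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Escape the literal text and replace each $n with a named, non-greedy group."""
    parts = re.split(r"\$(\d+)", pattern)  # literals at even indexes, numbers at odd
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pieces.append(re.escape(part))
        else:
            pieces.append(f"(?P<f{part}>.*?)")
    return re.compile("".join(pieces), re.DOTALL)


def strip_html_tags(value: str) -> str:
    """Remove anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", value)


regex = pattern_to_regex(PATTERN)
rows = []
for match in regex.finditer(SECTION):
    row = {}
    for number, (name, strip) in FIELDS.items():
        value = match.group(f"f{number}")
        row[name] = strip_html_tags(value) if strip else value
    rows.append(row)

for row in rows:
    print(row)
```

Each match yields one record per <tr>, so the Dow row becomes SYMBOL DJI, INDEX Dow, CLOSING 10,402.77, CHANGE +172.82 and CHANGEPCT (+1.69%).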