Java Code Examples for org.apache.nutch.crawl.CrawlDatum#getScore()
The following examples show how to use org.apache.nutch.crawl.CrawlDatum#getScore().
Example 1
Source File: OPICScoringFilter.java From anthelion with Apache License 2.0
/** Increase the score by a sum of inlinked scores. */
public void updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List inlinked)
    throws ScoringFilterException {
  float adjust = 0.0f;
  for (int i = 0; i < inlinked.size(); i++) {
    CrawlDatum linked = (CrawlDatum) inlinked.get(i);
    adjust += linked.getScore();
  }
  if (old == null) old = datum;
  datum.setScore(old.getScore() + adjust);
}
Example 2
Source File: OPICScoringFilter.java From nutch-htmlunit with Apache License 2.0
/** Increase the score by a sum of inlinked scores. */
public void updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List<CrawlDatum> inlinked)
    throws ScoringFilterException {
  float adjust = 0.0f;
  for (int i = 0; i < inlinked.size(); i++) {
    CrawlDatum linked = inlinked.get(i);
    adjust += linked.getScore();
  }
  if (old == null) old = datum;
  datum.setScore(old.getScore() + adjust);
}
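The accumulation in Examples 1 and 2 can be tried without a Nutch installation. The following is a minimal sketch under stated assumptions: `Datum` is a hypothetical stand-in for `CrawlDatum` that models only `getScore()` and `setScore()`, and the class name `ScoreAccumulationSketch` is invented for illustration. It mirrors the arithmetic shown above (sum the inlink scores, then add them to the previous score, falling back to the datum's own score when there is no previous entry) and is not the Nutch implementation itself.

```java
import java.util.Arrays;
import java.util.List;

class ScoreAccumulationSketch {

    /** Hypothetical stand-in for CrawlDatum; only the score accessors are modelled. */
    static final class Datum {
        private float score;

        Datum(float score) {
            this.score = score;
        }

        float getScore() {
            return score;
        }

        void setScore(float score) {
            this.score = score;
        }
    }

    /**
     * Adds the summed inlink scores to the previous score.
     * When old is null, the datum's own score is used as the base.
     */
    static void updateDbScore(Datum old, Datum datum, List<Datum> inlinked) {
        float adjust = 0.0f;
        for (Datum linked : inlinked) {
            adjust += linked.getScore();
        }
        Datum base = (old == null) ? datum : old;
        datum.setScore(base.getScore() + adjust);
    }

    public static void main(String[] args) {
        Datum old = new Datum(1.0f);
        Datum datum = new Datum(0.0f);
        // Base score 1.0 plus inlink contributions 0.25 and 0.5.
        updateDbScore(old, datum, Arrays.asList(new Datum(0.25f), new Datum(0.5f)));
        System.out.println(datum.getScore());
    }
}
```

The sketch uses a local `base` variable rather than reassigning the `old` parameter; the resulting score is the same as in the examples above.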
Example 3
Source File: OPICScoringFilter.java From anthelion with Apache License 2.0
/** Use {@link CrawlDatum#getScore()}. */
public float generatorSortValue(Text url, CrawlDatum datum, float initSort)
    throws ScoringFilterException {
  return datum.getScore() * initSort;
}
Example 4
Source File: AnthelionScoringFilter.java From anthelion with Apache License 2.0
/**
 * This is the score that is used for selecting the urls that are going to
 * be fetched. If you didn't know that you will have some headaches.
 */
@Override
public float generatorSortValue(Text url, CrawlDatum datum, float initSort)
    throws ScoringFilterException {
  // TODO Auto-generated method stub
  return datum.getScore();
}
Example 5
Source File: LinkAnalysisScoringFilter.java From anthelion with Apache License 2.0
public float generatorSortValue(Text url, CrawlDatum datum, float initSort)
    throws ScoringFilterException {
  return datum.getScore() * initSort;
}
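Examples 3 to 5 compute a sort value from the stored score, and the comment in Example 4 notes that this value is used for selecting the URLs to fetch. As a rough illustration of that idea, the sketch below ranks a few made-up URLs by `score * initSort`. Everything here is an assumption for demonstration purposes: the class `GeneratorSortSketch`, the `rank` helper, the sample URLs and scores are hypothetical, and the assumption that higher values are selected first is a simplification, not a description of Nutch's generator code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GeneratorSortSketch {

    /** Same arithmetic as Examples 3 and 5: stored score scaled by the incoming sort value. */
    static float generatorSortValue(float datumScore, float initSort) {
        return datumScore * initSort;
    }

    /** Returns the URLs ordered from highest to lowest sort value (assumed selection order). */
    static List<String> rank(Map<String, Float> scores, float initSort) {
        List<String> urls = new ArrayList<>(scores.keySet());
        urls.sort(Comparator.comparingDouble(
                (String u) -> generatorSortValue(scores.get(u), initSort)).reversed());
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical scores keyed by URL.
        Map<String, Float> scores = new LinkedHashMap<>();
        scores.put("http://a.example/", 0.5f);
        scores.put("http://b.example/", 2.0f);
        scores.put("http://c.example/", 1.0f);
        System.out.println(rank(scores, 1.0f));
    }
}
```

With a positive `initSort`, multiplying by it does not change the relative order of the URLs; the variant in Example 4 drops the multiplication and returns the stored score directly.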
Example 6
Source File: LinkAnalysisScoringFilter.java From anthelion with Apache License 2.0
public float indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum,
    CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
    throws ScoringFilterException {
  // normalizedScore is declared elsewhere in the enclosing class (not shown in this snippet).
  return (normalizedScore * dbDatum.getScore());
}
Example 7
Source File: OPICScoringFilter.java From nutch-htmlunit with Apache License 2.0
/** Use {@link CrawlDatum#getScore()}. */
public float generatorSortValue(Text url, CrawlDatum datum, float initSort)
    throws ScoringFilterException {
  return datum.getScore() * initSort;
}
Example 8
Source File: LinkAnalysisScoringFilter.java From nutch-htmlunit with Apache License 2.0
public float generatorSortValue(Text url, CrawlDatum datum, float initSort)
    throws ScoringFilterException {
  return datum.getScore() * initSort;
}
Example 9
Source File: LinkAnalysisScoringFilter.java From nutch-htmlunit with Apache License 2.0
public float indexerScore(Text url, NutchDocument doc, CrawlDatum dbDatum,
    CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
    throws ScoringFilterException {
  // normalizedScore is declared elsewhere in the enclosing class (not shown in this snippet).
  return (normalizedScore * dbDatum.getScore());
}