task

prj-scraper

[previously]

  1. [2013-mm-dd] C# read html into dom
  2. [2013-10-06] > x] prj-screen scraping
    1. ] write tutorial with different methods of screen scraping
    2. ] example - read values(for qualifying results) from mrn page into string
    3. x] research  
  3. [2013-10-07]> i] prj-scraper - 001-research
    1. ] api for racing data VS screen scraping

[currently]

[next]

  1. ] CD tech-dev-www READ ?? xslt ?? && 
    1. ] built into browser, ] use to transform xml into html, 
    2. ] use .this VS using "jq" or "svr side - framework - methods" to do transformation, b/c it is native to browser, 
      1. ] implementation of std across browser platforms
    3. ] ? use to deserialize objects or ???
  2. ] src = CD bk language javascript
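A minimal sketch of the "use XSLT to transform xml into html" idea from item 1, run from C# rather than in the browser. The XML, the stylesheet, and all element names (`results`, `driver`, `pos`) are made-up placeholders, not data from any real page. `XslCompiledTransform` is the .NET class in `System.Xml.Xsl`; a browser would apply the same kind of stylesheet natively.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

class XsltSketch
{
    static void Main()
    {
        // Hypothetical input: two result rows as XML (placeholder data).
        string xml =
            "<results>" +
            "<driver pos=\"1\">Rider A</driver>" +
            "<driver pos=\"2\">Rider B</driver>" +
            "</results>";

        // Hypothetical stylesheet: turn each <driver> into an HTML list item.
        string xslt = @"<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""html"" indent=""no""/>
  <xsl:template match=""/results"">
    <ul>
      <xsl:for-each select=""driver"">
        <li><xsl:value-of select=""@pos""/>: <xsl:value-of select="".""/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>";

        // Compile the stylesheet from the string.
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(xsltReader);
        }

        // Apply it to the XML and collect the HTML output as a string.
        var output = new StringWriter();
        using (var xmlReader = XmlReader.Create(new StringReader(xml)))
        {
            transform.Transform(xmlReader, null, output);
        }

        Console.WriteLine(output.ToString());
    }
}
```

The transform only accepts well-formed XML as input, so scraped HTML would first need to be cleaned up or parsed by something more tolerant.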
] html screen scraping
ex = mrn = "id=ctl00_cphMain_phExtra_ctl00_seasonStatsGridView"
- use WebClient() to download the page as a string
- use regex to parse the string
OR
- use library like HTMLAgilityPack (reads string into dom elements)
- has methods to query, 
- uses XSLT, to   
OR 
- directly use XML document class, LINQ to XML
- browsers have XSLT transform 
REVIEW
- example prev "weather pages" 
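Rough sketch of two of the options above (WebClient + regex, and LINQ to XML), assuming the target is the table with the id noted in the example. The sample markup and rider names are placeholders; no real page layout is assumed, and the page URL is left as a command-line argument because the note does not record one. WebClient is what the note names; newer .NET versions mark it obsolete in favour of HttpClient.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;

class ScrapeSketch
{
    // Table id copied from the note; everything else here is illustrative.
    const string TableId = "ctl00_cphMain_phExtra_ctl00_seasonStatsGridView";

    static void Main(string[] args)
    {
        string html;
        if (args.Length > 0)
        {
            // Download the whole page as one string.
            using (var client = new WebClient())
            {
                html = client.DownloadString(args[0]);
            }
        }
        else
        {
            // Placeholder markup so the sketch runs without a network call.
            html = "<table id=\"" + TableId + "\">" +
                   "<tr><td>1</td><td>Rider A</td></tr>" +
                   "<tr><td>2</td><td>Rider B</td></tr>" +
                   "</table>";
        }

        // Option 1: regex. Isolate the table by id, then pull out each cell.
        // Assumes a double-quoted id attribute and no nested tables.
        Match table = Regex.Match(
            html,
            "<table[^>]*id=\"" + Regex.Escape(TableId) + "\"[^>]*>(.*?)</table>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        if (!table.Success)
        {
            Console.WriteLine("table not found");
            return;
        }

        foreach (Match cell in Regex.Matches(
            table.Groups[1].Value,
            "<td[^>]*>(.*?)</td>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase))
        {
            Console.WriteLine(cell.Groups[1].Value.Trim());
        }

        // Option 2: LINQ to XML. Works only if the fragment is well-formed XML,
        // which real-world HTML often is not.
        try
        {
            XElement root = XElement.Parse(table.Value);
            foreach (XElement row in root.Descendants("tr"))
            {
                Console.WriteLine(
                    string.Join(", ", row.Elements("td").Select(td => td.Value)));
            }
        }
        catch (XmlException ex)
        {
            Console.WriteLine("not well-formed XML: " + ex.Message);
        }
    }
}
```

When the XML parse fails on real markup, that is the case where a tolerant parser such as HTML Agility Pack would take over from option 2.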

[reference]

  1. Parsing (X)HTML into a DOM tree in C# 
  2. Sound Code: Screen-Scraping in C# using LINQPad and HTML Agility Pack
  3. viziblr - News - Scraping the NHL 2010-2011 Schedule with C#, LINQ, and the HTML Agility Pack
  4. Screen scraping in C# | .NET Slave(mads kristensen) 
  5. The Wall: Screen scraping in C# using WebClient
  6. Html Agility Pack - Download: HAP 1.4.6 - a c# library for parsing html files
  7. 2015-05-08] https://blog.hartleybrody.com/web-scraping/ comments: https://news.ycombinator.com/item?id=4893864 
  8. ] legalities of scraping sites

Details

ID: 2879

NAME: PRJ-scraper

DESCRIPTION: ] how to read some data off another web page and ...

START DATE TIME: 2013-10-06 21:22:44

EST DURATION: 01:00:00

END DATE TIME: 2013-10-06 22:22:44

STATUS: In-Process

PRIORITY: -5

OWNER ID: 75
