forked from scrapy/parsel

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Digenis/parsel

This branch is 551 commits behind scrapy/parsel:master.

Latest commit

2ca8d8a · Apr 22, 2016

Parsel

Parsel is a library for extracting data from HTML and XML documents using XPath and CSS selectors.

Features

  • Extract text using CSS or XPath selectors
  • Regular expression helper methods

Example:

>>> from parsel import Selector
>>> sel = Selector(text=u"""<html>
        <body>
            <h1>Hello, Parsel!</h1>
            <ul>
                <li><a href="http://example.com">Link 1</a></li>
                <li><a href="http://scrapy.org">Link 2</a></li>
            </ul>
        </body>
        </html>""")
>>>
>>> sel.css('h1::text').extract_first()
u'Hello, Parsel!'
>>>
>>> sel.css('h1::text').re(r'\w+')
[u'Hello', u'Parsel']
>>>
>>> for e in sel.css('ul > li'):
        print(e.xpath('.//a/@href').extract_first())
http://example.com
http://scrapy.org
