Metadata-Version: 1.1
Name: parsel
Version: 1.5.2
Summary: Parsel is a library to extract data from HTML and XML using XPath and CSS selectors
Home-page: https://github.com/scrapy/parsel
Author: Scrapy project
Author-email: info@scrapy.org
License: BSD
Description: ===============================
        Parsel
        ===============================
        
        .. image:: https://img.shields.io/travis/scrapy/parsel/master.svg
           :target: https://travis-ci.org/scrapy/parsel
           :alt: Build Status
        
        .. image:: https://img.shields.io/pypi/v/parsel.svg
           :target: https://pypi.python.org/pypi/parsel
           :alt: PyPI Version
        
        .. image:: https://img.shields.io/codecov/c/github/scrapy/parsel/master.svg
           :target: http://codecov.io/github/scrapy/parsel?branch=master
           :alt: Coverage report
        
        
        Parsel is a library to extract data from HTML and XML using XPath and CSS selectors.
        
        * Free software: BSD license
        * Documentation: https://parsel.readthedocs.org.
        
        Features
        --------
        
        * Extract text using CSS or XPath selectors
        * Regular expression helper methods
        
        Example::
        
            >>> from parsel import Selector
            >>> sel = Selector(text=u"""<html>
                    <body>
                        <h1>Hello, Parsel!</h1>
                        <ul>
                            <li><a href="http://example.com">Link 1</a></li>
                            <li><a href="http://scrapy.org">Link 2</a></li>
                        </ul>
                    </body>
                    </html>""")
            >>>
            >>> sel.css('h1::text').get()
            'Hello, Parsel!'
            >>>
            >>> sel.css('h1::text').re(r'\w+')
            ['Hello', 'Parsel']
            >>>
            >>> for e in sel.css('ul > li'):
            ...     print(e.xpath('.//a/@href').get())
            http://example.com
            http://scrapy.org
        
        
        
        
        History
        -------
        
        1.5.2 (2019-08-09)
        ~~~~~~~~~~~~~~~~~~
        
        * ``Selector.remove_namespaces`` received a significant performance improvement.
        * The value of ``data`` within the printable representation of a selector
          (``repr(selector)``) now ends in ``...`` when truncated, to make the
          truncation obvious.
        * Minor documentation improvements.
        
        
        1.5.1 (2018-10-25)
        ~~~~~~~~~~~~~~~~~~
        
        * ``has-class`` XPath function handles newlines and other separators
          in class names properly;
        * fixed parsing of HTML documents with null bytes;
        * documentation improvements;
        * Python 3.7 tests are run on CI; other test improvements.
        
        
        1.5.0 (2018-07-04)
        ~~~~~~~~~~~~~~~~~~
        
        * New ``Selector.attrib`` and ``SelectorList.attrib`` properties which make
          it easier to get attributes of HTML elements.
        * CSS selectors became faster: compilation results are cached
          (LRU cache is used for ``css2xpath``), so there is
          less overhead when the same CSS expression is used several times.
        * ``.get()`` and ``.getall()`` selector methods are documented and recommended
          over ``.extract_first()`` and ``.extract()``.
        * Various documentation tweaks and improvements.
        
        One more change is that the ``.extract()`` and ``.extract_first()`` methods
        are now implemented using ``.get()`` and ``.getall()``, not the other
        way around, and instead of calling ``Selector.extract`` all other methods
        now call ``Selector.get`` internally. This can be **backwards incompatible**
        for custom Selector subclasses which override ``Selector.extract``
        without doing the same for ``Selector.get``. If you have such a Selector
        subclass, make sure the ``get`` method is also overridden. For example, this::
        
            class MySelector(parsel.Selector):
                def extract(self):
                    return super().extract() + " foo"
        
        should be changed to this::
        
            class MySelector(parsel.Selector):
                def get(self):
                    return super().get() + " foo"
                extract = get
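        
        For reference, a minimal sketch of the ``attrib`` properties and the
        ``.get()`` / ``.getall()`` methods listed above. The sample markup and the
        variable names are illustrative assumptions, not taken from the Parsel docs::
        
            from parsel import Selector
        
            # Illustrative markup only.
            sel = Selector(text=u'<ul>'
                                u'<li><a href="http://example.com">Link 1</a></li>'
                                u'<li><a href="http://scrapy.org">Link 2</a></li>'
                                u'</ul>')
        
            links = sel.css('a')
            first_text = sel.css('a::text').get()      # first match, or None if empty
            all_texts = sel.css('a::text').getall()    # list with every match
            first_href = links.attrib['href']          # attributes of the first element
            hrefs = [a.attrib['href'] for a in links]  # attributes of each element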
        
        
        1.4.0 (2018-02-08)
        ~~~~~~~~~~~~~~~~~~
        
        * ``Selector`` and ``SelectorList`` can't be pickled because
          pickling/unpickling doesn't work for ``lxml.html.HtmlElement``;
          parsel now raises TypeError explicitly instead of allowing pickle to
          silently produce wrong output. This is technically backwards-incompatible
          if you're using Python < 3.6.
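        
        A short sketch of the behaviour described above, assuming parsel 1.4.0 or
        later. The markup and variable names are illustrative, and pickling the
        extracted string is only one possible workaround::
        
            import pickle
        
            from parsel import Selector
        
            sel = Selector(text=u'<p>hello</p>')
            try:
                pickle.dumps(sel)
                pickled = True
            except TypeError:
                # Selectors wrap lxml elements, so pickling is rejected explicitly.
                pickled = False
        
            # Pickle the extracted data rather than the selector itself.
            payload = pickle.dumps(sel.css('p::text').get())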
        
        
        1.3.1 (2017-12-28)
        ~~~~~~~~~~~~~~~~~~
        
        * Fix artifact uploads to PyPI.
        
        
        1.3.0 (2017-12-28)
        ~~~~~~~~~~~~~~~~~~
        
        * ``has-class`` XPath extension function;
        * ``parsel.xpathfuncs.set_xpathfunc`` is a simplified way to register
          XPath extensions;
        * ``Selector.remove_namespaces`` now removes namespace declarations;
        * Python 3.3 support is dropped;
        * ``make htmlview`` command for easier Parsel docs development.
        * CI: PyPy installation is fixed; parsel now runs tests for PyPy3 as well.
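        
        As an illustration of the ``has-class`` function listed above, here is a
        minimal sketch with made-up markup. It assumes the function is registered
        automatically when ``parsel`` is imported::
        
            from parsel import Selector
        
            sel = Selector(text=u'<p class="intro lead">First</p>'
                                u'<p class="lead-in">Second</p>')
        
            # has-class matches whole class names, so "lead" does not match "lead-in".
            matches = sel.xpath('//p[has-class("lead")]/text()').getall()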
        
        
        1.2.0 (2017-05-17)
        ~~~~~~~~~~~~~~~~~~
        
        * Add ``SelectorList.get`` and ``SelectorList.getall``
          methods as aliases for ``SelectorList.extract_first``
          and ``SelectorList.extract`` respectively
        * Add default value parameter to ``SelectorList.re_first`` method
        * Add ``Selector.re_first`` method
        * Add ``replace_entities`` argument on ``.re()`` and ``.re_first()``
          to turn off replacing of character entity references
        * Bug fix: detect ``None`` result from lxml parsing and fall back to an empty document
        * Rearrange XML/HTML examples in the selectors usage docs
        * Travis CI:
        
          * Test against Python 3.6
          * Test against PyPy using "Portable PyPy for Linux" distribution
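        
        The ``re_first`` default value and the ``get`` / ``getall`` aliases from
        this release can be sketched as follows. The markup and the ``'n/a'``
        fallback are illustrative choices::
        
            from parsel import Selector
        
            sel = Selector(text=u'<p>Price: 42 EUR</p>')
        
            # First regex match on the selected text.
            price = sel.css('p::text').re_first(r'\d+')
            # No match, so the supplied default is returned instead of None.
            missing = sel.css('p::text').re_first(r'USD', default='n/a')
            # getall() is the alias of extract(); get() is the alias of extract_first().
            texts = sel.css('p::text').getall()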
        
        
        1.1.0 (2016-11-22)
        ~~~~~~~~~~~~~~~~~~
        
        * Change default HTML parser to `lxml.html.HTMLParser <http://lxml.de/api/lxml.html.HTMLParser-class.html>`_,
          which makes it easier to use some HTML-specific features
        * Add css2xpath function to translate CSS to XPath
        * Add support for ad-hoc namespace declarations
        * Add support for XPath variables
        * Documentation improvements and updates
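        
        A minimal sketch of ``css2xpath`` and XPath variables as listed above. The
        markup and the ``url`` variable name are illustrative assumptions::
        
            from parsel import Selector, css2xpath
        
            # Translate a CSS expression into the equivalent XPath string.
            xpath_expr = css2xpath('ul > li')
        
            sel = Selector(text=u'<ul><li><a href="http://scrapy.org">Scrapy</a></li></ul>')
        
            # $url is an XPath variable; its value is passed as a keyword argument.
            link_text = sel.xpath('//a[@href=$url]/text()', url='http://scrapy.org').get()
        
            # The translated XPath selects the same nodes as the CSS expression.
            same_items = sel.xpath(xpath_expr).getall() == sel.css('ul > li').getall()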
        
        
        1.0.3 (2016-07-29)
        ~~~~~~~~~~~~~~~~~~
        
        * Add BSD-3-Clause license file
        * Re-enable PyPy tests
        * Integrate py.test runs with setuptools (needed for Debian packaging)
        * Changelog is now called ``NEWS``
        
        
        1.0.2 (2016-04-26)
        ~~~~~~~~~~~~~~~~~~
        
        * Fix bug in exception handling causing original traceback to be lost
        * Added docstrings and other doc fixes
        
        
        1.0.1 (2015-08-24)
        ~~~~~~~~~~~~~~~~~~
        
        * Updated PyPI classifiers
        * Added docstrings for csstranslator module and other doc fixes
        
        
        1.0.0 (2015-08-22)
        ~~~~~~~~~~~~~~~~~~
        
        * Documentation fixes
        
        
        0.9.6 (2015-08-14)
        ~~~~~~~~~~~~~~~~~~
        
        * Updated documentation
        * Extended test coverage
        
        
        0.9.5 (2015-08-11)
        ~~~~~~~~~~~~~~~~~~
        
        * Support for extending SelectorList
        
        
        0.9.4 (2015-08-10)
        ~~~~~~~~~~~~~~~~~~
        
        * Try workaround for travis-ci/dpl#253
        
        
        0.9.3 (2015-08-07)
        ~~~~~~~~~~~~~~~~~~
        
        * Add base_url argument
        
        
        0.9.2 (2015-08-07)
        ~~~~~~~~~~~~~~~~~~
        
        * Rename module unified -> selector and promote root attribute
        * Add create_root_node function
        
        
        0.9.1 (2015-08-04)
        ~~~~~~~~~~~~~~~~~~
        
        * Setup Sphinx build and docs structure
        * Build universal wheels
        * Rename some leftovers from package extraction
        
        
        0.9.0 (2015-07-30)
        ~~~~~~~~~~~~~~~~~~
        
        * First release on PyPI.
        
Keywords: parsel
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: English
Classifier: Topic :: Text Processing :: Markup
Classifier: Topic :: Text Processing :: Markup :: HTML
Classifier: Topic :: Text Processing :: Markup :: XML
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
