Web Extractor

Delivery & Output

The Web Extractor is designed for integration. It can run locally next to your data repository or remotely on a server hosted by 30 Digits. Because of this, it delivers the information in multiple ways and in multiple formats.

 

Direct Integration – Push

If you wish to integrate it with your local system, or if you have the appropriate paths open, it can push the data directly into:

  • Search Engines
    • Autonomy, FAST, Lucene/Solr, …
  • Databases
    • Oracle, MySQL, Postgres, MS-SQL, …
  • XML Platforms
    • MarkLogic, eXist, …
  • Document Management Systems
    • Alfresco, Documentum, …
  • Content Management Systems
    • Typo3, LifeRay, SiteFusion, …

This is just a sampling, as most systems support either XML input or a database push. Custom input methods can also be added on request. The information can also be encrypted for transmission.
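As a rough illustration of what a database push involves, the sketch below parses a small XML document and inserts each record into a table. Everything in it is an assumption made for the example: the `records`/`record` element names, the `url`, `title`, and `body` fields, the `pages` table, and the `push_records` helper are hypothetical, and SQLite merely stands in for whichever database you run. It does not describe the Web Extractor's actual output schema or push mechanism.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical extracted records; the real output schema may differ.
SAMPLE_XML = """
<records>
  <record>
    <url>http://example.com/a</url>
    <title>Example A</title>
    <body>First extracted page.</body>
  </record>
  <record>
    <url>http://example.com/b</url>
    <title>Example B</title>
    <body>Second extracted page.</body>
  </record>
</records>
"""


def push_records(xml_text, conn):
    """Parse records from XML text and insert them into the 'pages' table.

    Returns the number of records found in the XML.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    root = ET.fromstring(xml_text.strip())
    rows = [
        (
            rec.findtext("url", default=""),
            rec.findtext("title", default=""),
            rec.findtext("body", default=""),
        )
        for rec in root.iter("record")
    ]
    # INSERT OR REPLACE keeps a re-pushed page from creating a duplicate row.
    conn.executemany("INSERT OR REPLACE INTO pages VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # stand-in for a real database
    print(push_records(SAMPLE_XML, connection), "records pushed")
```

Keying the table on the URL means that pushing the same page twice updates the existing row rather than adding a second one, which is one simple way to keep repeated pushes from piling up duplicates.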

 

Loose Integration – Pull

The files can also be written to disk, either instead of or simultaneously with the push. Once the data is written to disk, it can be pulled by any method of your choosing. Common methods for pulling are:

  • Email
  • FTP
  • RSS
  • JSON
  • WebDAV
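On the receiving side, pulling usually comes down to checking a location for files you have not processed yet. The sketch below shows that idea for JSON files in a local directory. The directory layout, the file names, the record fields, and the `pull_new_files` helper are all hypothetical, chosen only for illustration. A transfer over FTP or WebDAV would place the files in such a directory first.

```python
import json
import tempfile
from pathlib import Path


def pull_new_files(drop_dir, seen):
    """Return the parsed contents of JSON files in drop_dir not yet seen.

    'seen' is a set of file names that is updated in place, so calling the
    function again only returns files that arrived since the last call.
    """
    records = []
    for path in sorted(Path(drop_dir).glob("*.json")):
        if path.name in seen:
            continue
        with path.open(encoding="utf-8") as handle:
            records.append(json.load(handle))
        seen.add(path.name)
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        drop = Path(tmp)
        # Simulate one file having been written to disk (hypothetical fields).
        sample = {"url": "http://example.com/a", "title": "Example A"}
        (drop / "page-1.json").write_text(json.dumps(sample), encoding="utf-8")

        seen_names = set()
        print(pull_new_files(drop, seen_names))  # the new record
        print(pull_new_files(drop, seen_names))  # nothing new on a second pass
```

Tracking processed file names in memory is the simplest option. A longer-running integration would more likely persist that set or move processed files to an archive folder.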
 

There are also multiple formats in which the data can be delivered. If you would like to know more about that, follow this link to our page on formats.

If you would like more details on how to integrate the Web Extractor with your system or how to receive data, contact us directly or fill out the form to the left, and we will get back to you soon.