Get structured post data from news articles, blog posts and online discussions.

wh_news(token, q, ts = (Sys.time() - (3 * 24 * 60 * 60)), sort = NULL,
  order = NULL, accuracy = NULL, highlight = NULL, latest = NULL,
  quiet = !interactive())

Arguments

token

your token as returned by wh_token.

q

a string query containing the filters that define which posts will be returned.
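A minimal sketch of building such a query string in base R. Exact phrases carry their own double quotes inside the string, so single-quoted R strings or a small helper keep things readable. The helper name `quote_phrase` is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of webhoser): wrap a phrase in double quotes
# so the query treats it as an exact phrase.
quote_phrase <- function(x) paste0('"', x, '"')

# Combine an exact phrase with a bare keyword using a Boolean operator.
q <- paste(quote_phrase("US President"), "OR", "Trump")

# Inspect the raw string that would be passed as `q`.
cat(q, "\n")
```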

ts

return only results that were crawled after this timestamp (POSIXct or POSIXlt). Defaults to three days before the current time.
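As an illustration, a cutoff like the default can be computed in base R as follows. The variable names and the fixed date are arbitrary choices for this sketch.

```r
# Crawl-time cutoff: three days before now, as a POSIXct value.
# 3 * 24 * 60 * 60 is the number of seconds in three days.
three_days <- 3 * 24 * 60 * 60
ts <- Sys.time() - three_days

# The same cutoff expressed as seconds since the Unix epoch.
ts_epoch <- as.numeric(ts)

# A fixed cutoff can be built with as.POSIXct(); this date is arbitrary.
ts_fixed <- as.POSIXct("2017-06-01 00:00:00", tz = "UTC")
```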

sort

by default (when sort is not specified), results are sorted by crawl date, which is the recommended order. See Details for valid values.

order

if you sort the posts by any of the numeric sort values, this sets the direction in which they are returned: asc (default) or desc.

accuracy

return only posts with high extraction accuracy. This removes about 30% of the total matching posts (those with lower confidence).

highlight

return the fragments in the post that matched the textual Boolean query. The matched keywords will be surrounded by <em> tags.
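A sketch of pulling the matched keywords back out of a highlighted fragment with base R. The fragment below is made up for illustration, and the sketch assumes each keyword is wrapped as `<em>keyword</em>`.

```r
# Made-up highlighted fragment, for illustration only.
fragment <- "R is a <em>programming</em> <em>language</em> for statistics."

# Find every <em>...</em> span (non-greedy), then strip the tags.
spans <- regmatches(
  fragment,
  gregexpr("<em>.*?</em>", fragment, perl = TRUE)
)[[1]]
keywords <- gsub("</?em>", "", spans)

print(keywords)
```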

latest

return the latest 100 crawled posts matching your query (NOT recommended).

quiet

if TRUE, does not print informational messages to the console.

Value

object of class webhoser

Details

Valid sort values

  • relevancy

  • social.facebook.likes

  • social.facebook.shares

  • social.facebook.comments

  • social.gplus.shares

  • social.pinterest.shares

  • social.linkedin.shares

  • social.stumbledupon.shares

  • social.vk.shares

  • replies_count

  • participants_count

  • spam_score

  • performance_score

  • published

  • thread.published

  • domain_rank

  • ord_in_thread

  • rating
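One way to catch a mistyped sort value before making a request is a local check against the list above. This is only a sketch: `valid_sort` and `check_sort` are hypothetical names, not part of the package, and the package may do its own validation.

```r
# Sort values listed above, copied here for a local check.
valid_sort <- c(
  "relevancy",
  "social.facebook.likes", "social.facebook.shares",
  "social.facebook.comments", "social.gplus.shares",
  "social.pinterest.shares", "social.linkedin.shares",
  "social.stumbledupon.shares", "social.vk.shares",
  "replies_count", "participants_count", "spam_score",
  "performance_score", "published", "thread.published",
  "domain_rank", "ord_in_thread", "rating"
)

# Hypothetical helper: stop early on an unknown sort value or order.
check_sort <- function(sort = NULL, order = NULL) {
  if (!is.null(sort) && !sort %in% valid_sort) {
    stop("unknown sort value: ", sort)
  }
  if (!is.null(order) && !order %in% c("asc", "desc")) {
    stop("order must be 'asc' or 'desc', got: ", order)
  }
  invisible(TRUE)
}

check_sort("published", "desc")  # passes silently
```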

See official documentation for valid filters.

Examples

# NOT RUN {
token <- wh_token("xXX-x0X0xX0X-00X")

rstats <- wh_news(token, q = '"Programming language"') %>%  # use quote marks!
  wh_collect() # collect results

wh_news(
    token,
    q = paste0(
      '"US President" OR Trump crawled:>',
      as.numeric(Sys.time() - (3 * 24 * 60 * 60))
    )
  ) %>%
  wh_paginate(
    p = 1,
    ts = as.numeric(Sys.time() - (3 * 24 * 60 * 60))
  ) %>%
  wh_collect() -> trump
# }