For my inaugural Power Query post, I’ll show how you can get data from the Twitter API, specifically the Twitter GET search/tweets method.
The web API is one of my favorite data sources in Power Query, because it opens a new world of data to Excel users. There are literally thousands, and perhaps tens of thousands of APIs out there. Some examples include Data.gov, United States Census, Weather Underground, and Twitter.
The Twitter API does require authentication, so you will need a Twitter account. Once you have an account, here are the steps:
- Create a Twitter application here – https://apps.twitter.com/
- On the Permissions tab, I set mine to Read only. The other permission levels allow the application to author Tweets on your behalf or even access direct messages, so be careful.
- If you’re only using the application for personal data retrieval, the other settings are not that important.
- On the Power Query Options menu in Excel, select Privacy and then select “Ignore the Privacy levels and potentially improve performance”. This is required because a single query is combing multiple data sources (One to get a token and one to get the data).
- Copy the code below into the Power Query Advanced Editor.
- Enter the Consumer Key and Consumer Secret from your Twitter application on line 13. It should look like this: “xxxxxxxxxxxxxxxxxxxxxxxxx:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx”. IMPORTANT – the application keys should be treated as passwords. In some scenarios you can use the ApiKeyName option of the Web.Contents formula to store the key securely in the credential store. Unfortunately I was not able to get it to working in this case. I believe it is incompatible with the Content option, which makes the request a POST. See Chris Webb’s post about using the ApiKeyName option.
- Enter your query on line 28. My sample code queries for Allegient, which is my employer’s name (Shameless plug).
- Close and Load the query!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 |
/* This M script gets an bearer token and performs a tweet search from the Twitter REST API https://dev.twitter.com/oauth/application-only Requires establishing a Twitter application in order to obtain a Consumer Key & Consumer Secret https://apps.twitter.com/ IMPORTANT - The Consumer Key and Consumer secret should be treated as passwords and not distributed */ let // Concatenates the Consumer Key & Consumer Secret and converts to base64 authKey = "Basic " & Binary.ToText(Text.ToBinary("<ConsumerKey>:<ConsumerSecret>"),0), url = "https://api.twitter.com/oauth2/token", // Uses the Twitter POST oauth2/token method to obtain a bearer token GetJson = Web.Contents(url, [ Headers = [#"Authorization"=authKey, #"Content-Type"="application/x-www-form-urlencoded;charset=UTF-8"], Content = Text.ToBinary("grant_type=client_credentials") ] ), FormatAsJson = Json.Document(GetJson), // Gets token from the Json response AccessToken = FormatAsJson[access_token], AccessTokenHeader = "bearer " & AccessToken, // Uses the Twitter GET search/tweets method using the bearer token from the previous POST oauth2/token method GetJsonQuery = Web.Contents("https://api.twitter.com/1.1/search/tweets.json?q=Power BI&count=100", [ Headers = [#"Authorization"=AccessTokenHeader] ] ), FormatAsJsonQuery = Json.Document(GetJsonQuery), NavigateToStatuses = FormatAsJsonQuery[statuses], TableFromList = Table.FromList(NavigateToStatuses, Splitter.SplitByNothing(), null, null, ExtraValues.Error), ExpandColumn = Table.ExpandRecordColumn(TableFromList, "Column1", {"metadata", "created_at", "id", "id_str", "text", "source", "truncated", "in_reply_to_status_id", "in_reply_to_status_id_str", "in_reply_to_user_id", "in_reply_to_user_id_str", "in_reply_to_screen_name", "user", "geo", "coordinates", "place", "contributors", "is_quote_status", "retweet_count", "favorite_count", "entities", "favorited", "retweeted", "lang", "possibly_sensitive", "quoted_status_id", "quoted_status_id_str", "quoted_status"}, {"Column1.metadata", "Column1.created_at", "Column1.id", "Column1.id_str", "Column1.text", "Column1.source", "Column1.truncated", "Column1.in_reply_to_status_id", "Column1.in_reply_to_status_id_str", "Column1.in_reply_to_user_id", "Column1.in_reply_to_user_id_str", "Column1.in_reply_to_screen_name", "Column1.user", "Column1.geo", "Column1.coordinates", "Column1.place", "Column1.contributors", "Column1.is_quote_status", "Column1.retweet_count", "Column1.favorite_count", "Column1.entities", "Column1.favorited", "Column1.retweeted", "Column1.lang", "Column1.possibly_sensitive", "Column1.quoted_status_id", "Column1.quoted_status_id_str", "Column1.quoted_status"}) in ExpandColumn |
Notes:
- Twitter limits you to only 100 rows at a time, so don’t be surprised by the limited results.
- This pattern should work for other Twitter APIs. Just change the URL on line 28 and the column names as appropriate.
- Thanks to Matt Masson on the Power Query team for his efforts involving syntax highlighting in WordPress.
- Kasper de Jonge has also written about using Power Query with the Twitter API . I believe I’ve improved upon it by automating the retrieval of the token.
Thanks for the post! Tried it out on PowerBI Designer and it was great to see the tweets coming in. The one thing that is confusing me is that when using the #SQLCruise hashtag it only downloaded 12 tweets. When I use TweetDeck I get way more than 12. Any ideas on where to look
Hey Howard, I’m glad to hear that you got it working! I figured out what the discrepancy is between this Power Query example and TweetDeck. TweetDeck is using the GET users/search method instead of the GET search/tweets method, hence the different results.
The bad news is that the GET users/tweets method requires a different authentication mechanism with the Twitter API that involves user context (And greater difficulty). My example above uses “Application-only authentication”.
Hopefully I can find time to try and get that the other authentication mechanism working in Power Query, as it opens up even more options. In theory I believe you could post Tweets from Excel/Power Query 🙂
Also – great name btw!
Hi Chris,
Thanks for your wonderful post this is really helpful. I have been mainly MS SQL Server Developer. First time working with API and Rest API. Currently struggling to get Cherwell Rest API into PowerBI. It required web search to call Rest API .Which is not my strength
Hi Chris, this was super helpful in accessing the search API, but do you know why it might not authenticate when using “https://api.twitter.com/1.1/statuses/user_timeline.json?” or similar GETs? I’m looking to set up queries to pull from these and it looks like the only other method is to change my authentification timestamp, nonce, etc. each time I need to refresh. Thanks for this, this has been very helpful!
Bo
Hey Bo, I’m glad that you found the post useful. The user timeline method appears to be available using the Application-only authentication method described in this post, but I have not tried that one personally. What is the exact error that you’re seeing?
Thanks for the reply, Chris! I’m getting “The user was not authorized”, and it won’t authenticate on anonymous or anything.
Thanks for the help Chris – got it working yesterday. And then today I saw the info on the new Office Graph APIs, and wondered if a similar mechanism would work with that too. My skill level was too low to find out, but wondered if you’re looking at it with the same question?
https://graph.microsoft.io/
(Scenarios I’m thinking about are things like the “Get People I work with” feed which would be interesting for scenarios with students and researchers in universities)
Hi Ray, I would imagine that the process would be very similar. I have not done any work with the Microsoft Graph, however, so I can’t say for certain. I don’t have sufficient privileges to register an Office 365 app, but there is a new app registration portal in preview that I may explore as I have time.
FYI the Graph APIs are available as an OData feed, so provided your org isn’t too large, this should be a straightforward way to pull the data into Power BI
Thank You so much for this help. its the only way we can fetch data in excel. can i use this code in the SQL part in rapidminer 5.3??
Hello Sounak. I’m glad you found the post helpful. I am not familiar with Rapidminer, so unfortunately I cannot offer any help specifically with that product. If Rapidminer is able to work with REST APIs, then you could likely port it. You could also perhaps use another tool such as PowerShell or SSIS to obtain the data and store it in text files or SQL Server for further analysis.
Hi Chris,
Great article, it looks very promising.
I filled in my ConsumerKey and ConsumerSecret, the syntax check is OK, but I get the following error message at the GetJson step: “Please specify how to connect…”
Any suggestions are very welcome!
Thanks, Edgar
Hi Edgar, I don’t recall seeing that prompt when developing this query. Does the message appear in the PowerQuery data window? Or is there a popup? I ask because I’m wondering if the prompt is coming from PowerQuery or from the Twitter API.
This is what my screen looks like on the GetJson step – link.
HI Chris, The message appears in a yellow bar at the top of the main pane inside the Query Editor. I do not know how to send you a screenshot.
Thanks, Edgar
In the mouse-over popup message I get more info:
“Permission Error: Credentials are required to connect to the Web source. You’ll need to refresh and provide credentials to continue. (Source at https://api.twitter.com/1.1/search/tweets.json.)”
Nice article, greatly appreciated. I wanted to ask if you were able to find a solution for pulling Twitter Analytics (uses the Twitter AD API) with Power Query. The problem is that Twitter AD API does not work with OAuth2.0 that you use in this example.
Hi James, I have not worked with the Twitter Analytics or Ads API at all. I looked over the documentation, and it appears that the token for these requests must represent the “current user”. This authentication method is more difficult than the “Application-Only” authentication in my example above, but I think it’s still possible with Power Query. Part of this authentication request involves generating an oauth nonce, which is essentially a random alphanumeric string. I’ve developed some M code that generates a random alphanumeric string, and I’ll be posting it soon. It’s not a complete solution for the Analytics API, but it’s one piece of the puzzle.
Chris, yes that authentication is more challenging. It would be easier working with a web application because the POST requires an actual URL for the redirect for a user to login thereby confirming the “current user” you mention above. The issue here is that with Power Query, I don’t find think a solution requiring browser redirect from Power Query is a feasible solution. All other major social media outlets work with OAUTH2. Not sure why Twitter took this path.
Hello, i got the error: Expression.Error: The ‘Authorization’ header is only supported when connecting anonymously. These headers can be used with all authentication types: Accept, Accept-Charset, Accept-Language, Cache-Control, If-Modified-Since, Referer
Do you know why?
Phylipe, doing a HTTP POST in Power Query is only supported when using anonymous authentication. Otherwise you would have redundant or conflicting credentials. When you first run the query in this blog post, Power Query should ask you how you want to connect. This is where you select the anonymous option. If you need to change the credentials, you can do so via the Data Source Setting menu in the Excel/Power Query menu.
Thanks very much for this explanation Chris, now it works for me….
This sample saved my day – but for a different data source. I was struggling to pull data from our cloud based ticket system provided by Zendesk – but some minor tweaks in the query from the article and my face was one big smile.
That’s great to hear! Thanks for the feedback!
Thank you for this awesome functionality. I was wondering if you could provide some input on the following: When I publish a dashboard that pulls in tweets as you have described here, it breaks the refresh ability. If I try to manually refresh the data set in Power BI service, it says that my data source credentials are invalid. The errors are asking me to edit my credentials for:
https://api.twitter.com/oauth2/token
https://api.twitter.com/1.1/search/tweets.json?q=%5BMy Query]&count=100
Any thoughts?
David, I have experienced a similar error with a REST call in Power BI. Some have claimed to have solved this by using the RelativePath argument with Web.Contents. Unfortunately I have not had a chance to try this myself. The forum post is here. Chris Webb has written a blog post about RelativePath that is worth reading as well.
Hello, did you solve this problem? I currently have this problem. Please help
Hey Chris, awesome post. I am in the field of architecture and just gradually getting in to data science. So this is new terrain to me. Is there a way to use the same logic to get search data from instagram?
Thanks.
Thank you! Yes, Instagram does have an API. I have not had an opportunity to explore it, myself. Here’s a link – Instagram API.
Hi, Chris, i changed code for OAuth to:
authKey = “Basic ” & Binary.ToText(Text.ToBinary(“:”),0),
url = “https://api.twitter.com”,
GetJson = Web.Contents(url,
[
Headers = [#”Authorization”=authKey,
#”Content-Type”=”application/x-www-form-urlencoded;charset=UTF-8″],
),
But i responopsed the same error ” “Please specify how to connect…”.
What i did wrong? Thank you!
Something strange with code, sorry.
DataSource.Error: Web.Contents with the Content option is only supported when connecting anonymously.
Details:
DataSourceKind=Web
DataSourcePath=https://api.twitter.com/oauth2/token
I’m getting the same error how can correct it????
Make sure that the Power Query connections for both the token and the Search Tweets URLs are set to use Anonymous access. You can change the credentials in the Data Source Setting menu in the Excel/Power Query menu.
In Data Source Settings, select the data source, then Edit Permissions. Then under Credentials, select Edit and change to Anonymous.
Is there any way to get more than 100 tweets? I’ve seen other apps without that restriction.
The GET search/tweets method is limited to 100 results. However, you can perhaps query it multiple times, using query parameters based on the results to get additional sets. Here’s one post on StackOverflow that describes how to do this. I have not tried it myself.
http://stackoverflow.com/questions/17887984/is-it-possible-to-get-more-than-100-tweets
Is this still a limitation and this is the only workaround ?
Yes, this method still has a maximum of 100 results per query. Twitter has an article called Working with Timelines that discusses how to page through results.
Can you help how can i get more than 200 rows ??i m not getting how to work with the suggestion at stack overflow whch you mention above
Getting more than 100 rows requires using a technique called cursoring. You would use a query similar to the one in this post, but you supply additional parameters to specify the range of data to return. I think this is possible with Power Query, but I have not tried it myself, and it was outside the scope of this post.
Twitter has an article called Working with Timelines that discusses this technique.
Hi,
I have a POST web method to get token and data as shown below.
To get token –> used Postman service POST method :API link and in the body segment used the below method call to get the token.
{
“method”:”Authentication::login”,
“args”:{
“email”:”emailadress”,
“password”:”XXXXXX”
}
}
Can I do the same in M query ? To get data, a different method is called using the token received above.
Thanks,
Paddu
Hi,
I’ve just this post for setting up a conection with filemaker.
The only step I’m missing is to get the response token as text.
It says in dutch: Expression.Error: Kan operator & niet toepassen op typen Text en Record.
When I make the expression: “bearer “& AccessToken.
How can i get AccessToken as tekst.
Regards Leo
Hello Chris,
I got this working today, but my question is about the completeness of the tweets – it seems not all of the text is retrieved – do you know of this problem and any solution for it?
Please see the notes section above. The Twitter API returns 100 rows at a time, so multiple API calls are required to get more results. The technique for doing so is beyond the scope of this post.
Hello Chris,
Is there a way to extract last 7 days or for that matter any specific date range tweets using JSON?
Hi, for some reasons I am unable to get all the tweets even when the count =100 and I have about 118 tweets in total.
Hey Chris, I would like to thank you to have posted that. It helped me a lot. Cheers
Hi Chris, thanks for this great tutorial! I keep getting a (400) bad request error and it says failed to get contents (from the search url). Any idea what I might be doing wrong?
Hi Chris, I came here from Curbal.com to see if I could make this work. I think Twitter have amended their authorization methods or something on the new developers.twitter.com interface. Anyway, your code didn’t work for me. An error message was that PowerBI was expecting json to be sent back but html was sent instead – something like that anyway. It would be great if you could update for Curbal and for the rest of the world! But maybe it’s impossible now? I don’t know.
Hi Chris, Your blog post has been really helpful. Thank you for sharing this wonderful tutorial. I did this and its perfect. !!
Thank you for the post. It helped a lot!