How to download the export.json file for an instance (e.g., using curl or programmatically)?

bugbuster · December 7, 2019, 5:04pm

If logged in from a browser, I can download this file with https://write.as/me/export.json. That gives me a file of all my blogs and anonymous posts. It can also be done for writefreely instances.

But I’d like to do this from the command line on a remote linux server. Can this file be downloaded via curl or programmatically? I tried using the API with the access token via the curl command but that doesn’t work. Or could it be done from the CLI version of write.as/writefreely? (I didn’t see that option in the docs)

I would like to do an unattended download periodically via a linux shell script or, if necessary, a program (e.g., php or python). If the CLI version can do this download, that would also work for my purposes.

Any ideas on whether this is possible to do?

robjloranger · December 7, 2019, 11:30pm

If you’re looking for JSON, you can do this with the API. Check out https://developers.write.as/docs/api/#retrieve-user-39-s-posts for details. This should return a JSON array of all posts, anonymous and otherwise. I’m not certain whether it includes pinned posts, however. You would likely want to pull collections as well.
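The API call suggested here can be sketched in Python roughly as follows. This is only a sketch under assumptions: it assumes the `GET /api/me/posts` endpoint and the `Authorization: Token ...` header described in the linked API docs, and a response envelope of the form `{"code": ..., "data": [...]}`. The function names and the default output filename are illustrative, not part of any official client.

```python
import json
import urllib.request

API_BASE = "https://write.as"  # assumption: swap in your WriteFreely instance URL


def build_posts_request(token, base_url=API_BASE):
    """Build the authenticated GET request for the user's posts."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/me/posts",
        headers={"Authorization": "Token " + token},
    )


def extract_posts(payload):
    """Return the post list from an assumed {"code": ..., "data": [...]} envelope."""
    return payload.get("data", [])


def save_posts(token, out_path="wa-posts.json", base_url=API_BASE):
    """Download all posts and write the raw API response to out_path."""
    with urllib.request.urlopen(build_posts_request(token, base_url)) as resp:
        payload = json.load(resp)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
    # Report how many posts were saved.
    return len(extract_posts(payload))
```

Calling `save_posts(token)` from a scheduled script would cover the unattended, periodic download asked about above. Note that the saved file is in the API's format, not the export.json format.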

bugbuster · December 8, 2019, 5:47am

Thanks for that reference, Rob. I can make use of that in lieu of some scripted way to download the export.json file for write.as and writefreely instances. I already had some code written to use export.json but I can adapt it.

cjeller1592 · December 9, 2019, 8:20pm

Hi @bugbuster, there currently isn’t a way to download the file via curl or programmatically. I’ve tried both ways (with curl and a Python script) with no success. @robjloranger’s suggestion is what I would’ve recommended as the closest thing to the export.json file. We’ll make sure to make this a feature request so you can follow its progress.

bugbuster · December 9, 2019, 10:17pm

Great! Thank you cj, I appreciate that.

robjloranger · December 10, 2019, 3:39pm

I can write a Python script if that’s of interest, that will save a JSON file

bugbuster · December 10, 2019, 6:32pm

Thanks for the follow-up offer, Rob, but I don’t think I need to ask for that unless it’s a real quickie that you’re interested in doing. Here’s my use case and the current state of my PHP program to use this data.

Use Case: I frequently query my wa/wf posts (anon and blog) for specified keywords, but I don’t wish to do that against the live data, as I’m concerned that may put an unnecessary load on the server. So I periodically download the export.json file manually and use a PHP program to filter it to output just the keyword-matched lines. So my queries are always done against the saved export files and I’m not tying up server resources.

The PHP program is currently coded to use the export.json file and it works great for my purposes. Now I’d like to replace the currently manual download with a periodic automated one. CJ confirms what I found in that the export.json file is not downloadable via curl or the API. I could use the API to alternatively get all my posts as you’ve suggested, and have successfully tested an automated download for that using curl with an access token. However, the json format is different from that of the export.json file, so I need to reprogram the PHP script to query the API-generated data. I plan to do that as I have time, but until that’s done I’ll just manually download the export.json file.

CJ mentioned a feature request for downloading the export.json file without having to do it manually from the /me page while logged in, and I think that would be of use to some others besides just me. Maybe your mention of a Python program to do this is how such a feature request could be implemented? But in the meantime, I’ll just manually download the files and work on the program revision to use the API data as I have time.

Thanks to you and CJ for your responses.

robjloranger · December 10, 2019, 6:59pm

Ok, if I write one anyway anytime soon I will share here. I think the feature on writefreely would be easy enough to implement, if it’s something @matt is interested in I can do it soon.

bugbuster · December 12, 2019, 5:48am

Ok, I’ve created an example PHP search program that queries the JSON data downloaded from the API that includes all the posts (anonymous or blog) for a user’s Write.as or WriteFreely instance(s). I wrote a blog post explaining the program and included the code, in case anyone is interested in it.

Example program to search Write.as/WriteFreely posts from API JSON data
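For readers who prefer Python, the general idea of searching a locally saved copy of the API data can be sketched as below. This is not the PHP program from the blog post. It is a minimal illustration that assumes each post object carries `title` and `body` fields and that the saved file is either the API envelope (`{"data": [...]}`) or a bare list of posts. It only supports an "all keywords must match" search.

```python
import json


def load_posts(path):
    """Load posts from a saved API response ({"data": [...]}) or a bare list."""
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    return payload["data"] if isinstance(payload, dict) else payload


def matches_all(post, keywords):
    """True if every keyword occurs (case-insensitively) in the title or body."""
    text = ((post.get("title") or "") + "\n" + (post.get("body") or "")).lower()
    return all(kw.lower() in text for kw in keywords)


def search(posts, keywords):
    """Return the posts that contain all of the given keywords."""
    return [p for p in posts if matches_all(p, keywords)]
```

For example, `search(load_posts("wa-posts.json"), ["export", "json"])` would return only the posts mentioning both words. Because the search runs against the local file, it makes no requests to the server.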

cjeller1592 · December 12, 2019, 7:29pm

Thanks for sharing the code @bugbuster! I am trying to recreate the PHP search program on Glitch with an old export.json file of mine here. It doesn’t seem to want to search anything though. What do you think it could be? I am not too familiar with PHP.

Maybe you could recreate it there better than I could! There are a couple of starter PHP projects you could use.

bugbuster · December 12, 2019, 9:16pm

I think there are two things to do:

First, you are using JSON data from the logged-in export JSON feature on Write.as. Instead, you need to use the JSON data generated by the API, as instructed in the blog article and in the PHP program. Replace the export JSON with the API JSON and it should work. The two file formats are different.
Second, the Glitch remix you are using is unnecessarily complex for this purpose. I suggest you remix my wawf-api-data-search project on Glitch, which is a simplified remix of yours, including the export JSON data. Substitute the API JSON data for that export JSON data and it should then work.

cjeller1592 · December 12, 2019, 10:06pm

Thanks @bugbuster! I followed your instructions and the app now works like a charm.

This is an amazing search app by the way. It’s amazingly quick and I love having the ability to do Boolean searches and to see where each result is coming from (Anonymous, blog, etc). I might make this my go to for searching all of my Write.as/WriteFreely posts.

Would love to promote this more and make it easier for others to grab the JSON so that they too can take advantage of this useful app you’ve built. Well done!

bugbuster · December 12, 2019, 10:25pm

Great! Glad to hear it’s working on Glitch. Yeah, performance is so much better when the data files are local to the server rather than having to pull them off the web one by one. And here the user’s posts are all in just one JSON file, so there’s no need for a separate HTTP request for each post, which is slow and also puts an extra burden on the server.

Seeing your output reminds me of a code tweak I need to do. There are several posts of yours where there is no title for the post and thus there’s no clickable link for it in the output. The reason (at least one reason I know of) why the title is missing is that the first line of the post does not begin with # and a space. It appears to me that is how the API identifies what the title is. If no # and space on the first line, the title is blank in the API output. After doing the query, you know you have a keyword match, but you have no idea what the post is about due to the missing title.

BUT, there’s a workaround for that. If the title is blank, then the id can be used instead to provide a clickable link. So you won’t get a convenient text title but at least there’ll be a clickable link to navigate to the post so you can see what it is. Then if you wish, you can add a title line to the post. I’ll work on that as I have time and edit the blog article and also post the update here so you can edit your glitch app accordingly.

cjeller1592 · December 12, 2019, 10:52pm

Yeah the JSON file saves so much time and effort.

As for the untitled posts in search results, I’ve had success making the title of a post “Untitled” instead of using the ID. I don’t know if it’s quicker per se but it makes it easier for me to read.

bugbuster · December 12, 2019, 11:19pm

@cjeller1592 Done! Code update: if title is blank but the slug is populated, the slug is substituted for the title as it may contain useful identifying info. But if both title and slug are blank, then the post id is used, which is just random characters but at least it results in a clickable link.
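The fallback order described here (title, then slug, then id) can be sketched in Python as follows. This is an illustration only, not the code from index.php. It assumes the post objects use the field names `title`, `slug`, and `id`, and the function name `display_title` is hypothetical.

```python
def display_title(post):
    """Pick a link label for a post: title, else slug, else post id."""
    for key in ("title", "slug", "id"):
        # Treat missing, null, and whitespace-only values as blank.
        value = (post.get(key) or "").strip()
        if value:
            return value
    return ""
```

With this order, a post with a blank title but a populated slug is labeled by its slug, and a post with neither is labeled by its id, so every search result still gets a clickable link.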

I’ve updated the blog post for this tweak. The update is available at Gossamer Gojirasaurus. Lines 139-146 of index.php contain the update. Everything else remains the same.

cjeller1592 · December 13, 2019, 9:44pm

Excellent, I just added it to my remix of your search app.

I also created a companion app so that anyone can grab the output of their posts without having to mess with the API, curl, or access tokens (here it is). All you need to do is remix @bugbuster’s app first, log in with your Write.as credentials on the companion app, copy the JSON it spits out, and paste it into the wa-posts.json file of your remixed search app. And to be safe, the companion app logs you out at the end, making the authentication token generated when logging in useless except for getting the JSON.
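The log in, fetch, log out sequence described here can be sketched in Python as below. This is not the companion app's actual code. It is a minimal sketch that assumes the Write.as API endpoints `POST /api/auth/login` (with `alias` and `pass` fields), `GET /api/me/posts`, and `DELETE /api/auth/me`, and a login response of the form `{"data": {"access_token": ...}}`. The helper names `http_send` and `export_posts` are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://write.as"  # assumption: or your WriteFreely instance URL


def http_send(method, path, token=None, body=None):
    """Send one JSON request to the API; return the decoded response, or None if empty."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = "Token " + token
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(API_BASE + path, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def export_posts(alias, password, send=http_send):
    """Log in, fetch all posts, then log out so the token cannot be reused."""
    login = send("POST", "/api/auth/login", body={"alias": alias, "pass": password})
    token = login["data"]["access_token"]
    try:
        return send("GET", "/api/me/posts", token=token)["data"]
    finally:
        # Invalidate the token even if the fetch fails.
        send("DELETE", "/api/auth/me", token=token)
```

The `send` parameter exists so the flow can be exercised with a stand-in function instead of real network calls. Logging out in a `finally` block mirrors the safety step mentioned above: the token is only ever used for the one download.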

My hope is that this will make using the search app easier, especially for grabbing an updated list of all your posts to search from.

bugbuster · December 15, 2019, 9:43pm

For those who might want to try this: a couple points about how glitch.com works regarding the public vs private visibility of the JSON data files in the glitch app. I added these comments to the app’s README.md.

Note that unless you make your project private, your JSON files will be copied over to anyone who remixes your project. See this glitch help topic for more info and instructions on how to make your project private if you wish to do so. Your blog is probably public anyway, but you have the choice whether to keep these JSON files private or not.

If you want to keep your project public, but don’t want your JSON data files to be available to others who remix your project, you can put the data files in a special .data folder, which is not copied over or visible to others when a project is remixed.

If you put your JSON files in .data, you must reference them accordingly in the index.php program. Here is an example:

$blog[0] = array("fn" => ".data/wa-posts.json", "href" => "https://write.as/", "md_ext" => '.md');

As an example, this project has the JSON data file stored in the .data folder. If you remix this app, you’ll need to create the .data folder and add the JSON file(s) to it per the glitch help message referenced above.

underlap · January 24, 2025, 10:54am

Was a feature request ever created? If so, please could you link to it? Thanks.