Installation & Instantiation
Install package using NPM or Yarn.
npm i twitter-archive-reader
or
yarn add twitter-archive-reader
This package uses JSZip internally to read ZIP archives, so you can load archives in this module the same way you load them in JSZip.
// ESModules
import TwitterArchive from 'twitter-archive-reader';
// CommonJS
const { TwitterArchive } = require('twitter-archive-reader');
You can create an instance from several types of objects, all of which must reference an archive.
Once you've created the instance, you must wait for it to become ready by awaiting the .ready() promise.
Supported loading methods:
- File from browser
- Buffer or ArrayBuffer
- string (filename) for loading files in Node.js
- number[] or Uint8Array, byte arrays
- JSZip instances
- Archive instances (see StreamZip.ts/Archive class)
// You can create TwitterArchive with all supported
// formats by JSZip's loadAsync() method
// (see https://stuk.github.io/jszip/documentation/api_jszip/load_async.html).
// You can also use filename or node Buffer.
// By a filename (Node.js)
const archive = new TwitterArchive('filename.zip');
// Or, in a browser, by a file input (File object)
// const archive = new TwitterArchive(document.querySelector('input[type="file"]').files[0]);
// Initialization can be long (unzipping, tweets & DMs reading...)
// So archive supports events, you can listen for initialization steps
archive.events.on('zipready', () => {
// ZIP is unzipped
});
archive.events.on('tweetsread', () => {
// Tweet files have been read
});
// See all available listeners in Events section.
console.log("Reading archive...");
// You must wait for ZIP reading and archive object build
await archive.ready();
// Archive is ready !
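Because the .ready() promise is rejected when reading fails, it can help to wrap the wait in error handling. The following is a minimal sketch that relies only on the .ready() method described above; `waitUntilReady` is a hypothetical helper name, not part of the package.

```javascript
// Hypothetical helper (not part of twitter-archive-reader): waits for
// archive.ready() and turns a rejection into a plain result object.
async function waitUntilReady(archive) {
  try {
    await archive.ready();
    return { ok: true, archive };
  } catch (error) {
    // Reading failed; the rejection reason is passed through unchanged.
    return { ok: false, error };
  }
}
```

A caller could then check `result.ok` before touching any archive data, instead of letting an unhandled rejection escape.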
You can set options when you create the TwitterArchive instance.
Available options are:
new TwitterArchive(
/**
* Archive to load.
* Can be a string (filename), number[], Uint8Array,
* JSZip, Archive, ArrayBuffer and File objects.
*
* If you want to build an archive instance **without** a file, you can pass `null` here.
* You must then load parts of the archive with `.loadArchivePart()` or `.loadClassicArchivePart()` !
*/
file: AcceptedZipSources,
options: TwitterArchiveLoadOptions = {
/**
* Specify if you want to ignore a specific part of archive, for performance or memory reasons.
*
* Available parts are in `ArchiveReadPart` type.
*
* By default, all parts are imported from archive.
* If you want to ignore every part, you can specify `"*"` in the part array.
*
* **Profile and account data is always parsed.**
*
* ```ts
* type ArchiveReadPart = "tweet" | "dm" | "follower" | "following" | "mute" | "block" | "favorite" | "list" | "moment" | "ad";
* ```
*
* To manually load a part after the archive has been loaded, use the `.initArchivePart()` method.
* Please don't initialize a part twice, as it could lead to subtle bugs!
*/
ignore?: (ArchiveReadPart | "*")[],
}
)
Since 6.0.0, you can control which parts of the archive are loaded during the initial archive load, then decide later which other parts to read.
By default, every part of the archive is fully loaded and constructed into data structures.
You can ignore specific parts with the options.ignore parameter of the TwitterArchive constructor.
Available parts are defined in the ArchiveReadPart type.
type ArchiveReadPart = "tweet" | "dm" | "follower" | "following" | "mute" | "block" | "favorite" | "list" | "moment" | "ad";
Note: User information is always fully loaded and parsed if available in the archive. This kind of information is light and should not cause any problems.
// Instantiate without direct messages and ads
const archive = new TwitterArchive('filename.zip', {
ignore: ['ad', 'dm']
});
You can initialize an instance without loading any data except user information.
Just specify '*' in the options.ignore array.
const archive = new TwitterArchive('filename.zip', {
ignore: ['*']
});
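When only a few parts are needed, listing every unwanted part by hand is easy to get wrong. The sketch below builds the ignore array from the parts you want to keep. `ALL_PARTS` and `ignoreAllExcept` are hypothetical names introduced for illustration; only the part names themselves come from the ArchiveReadPart type above.

```javascript
// All part names from the ArchiveReadPart type shown above.
const ALL_PARTS = [
  'tweet', 'dm', 'follower', 'following', 'mute',
  'block', 'favorite', 'list', 'moment', 'ad',
];

// Hypothetical helper (not part of twitter-archive-reader): returns the
// `ignore` array that skips every part except the ones listed in `wanted`.
function ignoreAllExcept(...wanted) {
  const unknown = wanted.filter((part) => !ALL_PARTS.includes(part));
  if (unknown.length > 0) {
    throw new TypeError(`Unknown archive part(s): ${unknown.join(', ')}`);
  }
  return ALL_PARTS.filter((part) => !wanted.includes(part));
}

// Intended use, with TwitterArchive imported as shown earlier:
// const archive = new TwitterArchive('filename.zip', {
//   ignore: ignoreAllExcept('tweet'),
// });
```

Rejecting unknown names early means a typo such as 'tweets' fails loudly rather than silently loading nothing extra.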
If you skip parts during archive initialization, you can manually parse them later with the .initArchivePart() method.
Each parameter of this method is an ArchiveReadPart.
Take care not to load a part twice! It could lead to unexpected side effects.
// Skip everything except the tweets
const archive = new TwitterArchive('filename.zip', {
ignore: ['dm', 'ad', 'follower', 'following', 'mute', 'block', 'favorite', 'list', 'moment']
});
await archive.ready();
// ...
// Later
// We want access to direct messages and ad data now
await archive.initArchivePart("dm", "ad");
// Ready to access them !
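Since loading a part twice should be avoided, one option is to keep track of which parts have already been read. This is a minimal sketch of such bookkeeping; `PartTracker` and its `claim` method are hypothetical and not provided by the package, which is only assumed to expose .initArchivePart() as described above.

```javascript
// Hypothetical helper (not part of twitter-archive-reader): remembers which
// parts are already loaded so that each one is initialized at most once.
class PartTracker {
  constructor(alreadyLoaded = []) {
    this.loaded = new Set(alreadyLoaded);
  }

  // Returns the requested parts that still need loading (without
  // duplicates) and marks them as loaded, so a second request for the
  // same part yields nothing.
  claim(...parts) {
    const missing = [...new Set(parts)].filter((part) => !this.loaded.has(part));
    missing.forEach((part) => this.loaded.add(part));
    return missing;
  }
}

// Intended use, after `await archive.ready()`:
// const tracker = new PartTracker(['tweet']); // parts read at construction
// const missing = tracker.claim('dm', 'ad');
// if (missing.length > 0) await archive.initArchivePart(...missing);
```

The sketch marks parts as loaded as soon as they are claimed; if .initArchivePart() rejects, the caller has to decide whether a retry is safe.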
Reading an archive takes a while: the ZIP must be unzipped, then tweets, user information, direct messages, and other data must be read. You might therefore want to display the current loading step to the end user.
TwitterArchive provides an event system driven by the events package (compatible with native Node.js events).
The event emitter is available through the .events property of the TwitterArchive object.
You can listen to events with the .events.on() method, and remove listener(s) with .events.off().
Events are listed in their order of appearance.
Any of the described events, except error, contain elements in it (in detail attribute).
Fires when the archive is unzipped (its content has not been read yet!).
Fires when basic user information (archive creation date, user details) has been read.
Fires when the tweet index (months, tweet number) has been read.
Fires when tweet files have been read.
Fires when direct message files are about to be read. This event does not fire when a classic archive is given.
Fires when misc information (favorites, moments...) is about to be read. This event does not fire when a classic archive is given.
Fires each time one of the events listed above fires.
archive.events.on('read', ({ step }) => {
console.log("Archive is at read step", step);
});
Fires when the reading process is over.
Linked to the .ready() promise (fulfilled).
Fires when the read fails.
Contains, in the detail attribute, the thrown error.
Linked to the .ready() promise (rejected).
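To show progress to the end user and clean up afterwards, listeners can be registered with .events.on() and removed with .events.off(). The sketch below is only an illustration: `trackLoading` is a hypothetical helper, the `{ step }` payload shape is taken from the read example above, and accepting the error either directly or wrapped in a detail attribute is an assumption.

```javascript
// Hypothetical helper (not part of twitter-archive-reader): reports loading
// steps through `onStep` and failures through `onError`. Returns a function
// that removes both listeners again.
function trackLoading(emitter, onStep, onError) {
  // Payload shape `{ step }` is taken from the 'read' example above.
  const handleRead = ({ step }) => onStep(step);

  // Assumption: the error may arrive directly or wrapped in `detail`.
  const handleError = (payload) => {
    const wrapped =
      payload !== null && typeof payload === 'object' && 'detail' in payload;
    onError(wrapped ? payload.detail : payload);
  };

  emitter.on('read', handleRead);
  emitter.on('error', handleError);

  return function stop() {
    emitter.off('read', handleRead);
    emitter.off('error', handleError);
  };
}

// Intended use:
// const stop = trackLoading(archive.events, (step) => console.log(step), console.error);
// try { await archive.ready(); } finally { stop(); }
```

Returning the `stop` function keeps the handler references private, so the caller cannot accidentally pass a different function to .events.off().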
Next part is Archive Properties.