The env itself added #2

Merged
merged 2 commits into from
Sep 27, 2017
9 changes: 9 additions & 0 deletions .babelrc
@@ -0,0 +1,9 @@
{
  "presets": [
    "latest",
    "stage-0"
  ],
  "plugins": [
    "transform-runtime"
  ]
}
6 changes: 6 additions & 0 deletions .dockerignore
@@ -0,0 +1,6 @@
*
!lib
!index.js
!package.json
!docker
!vendor
14 changes: 14 additions & 0 deletions .eslintrc
@@ -0,0 +1,14 @@
{
  "parser": "babel-eslint",
  "extends": "airbnb-base",
  "env": {
    "browser": false,
    "node": true,
    "es6": true
  },
  "rules": {
    "max-len": [2, 120],
    "no-param-reassign": [2, {"props": false}],
    "prefer-template": [0]
  }
}
61 changes: 2 additions & 59 deletions .gitignore
@@ -1,59 +1,2 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage

# nyc test coverage
.nyc_output

# Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# node-waf configuration
.lock-wscript

# Compiled binary addons (http://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
jspm_packages/

# Typescript v1 declaration files
typings/

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variables file
.env

node_modules
build
23 changes: 23 additions & 0 deletions Dockerfile
@@ -0,0 +1,23 @@
FROM node:6

MAINTAINER Andrew Reddikh <[email protected]>

# Define working directory
WORKDIR /app

ADD package.json /app/package.json

# Install dependencies
RUN npm install

# Add the actual code of goose as a node module
ADD lib /app/node_modules/goose-parser/lib
ADD vendor /app/node_modules/goose-parser/vendor
ADD index.js /app/node_modules/goose-parser

# Set env
ENV PATH=/usr/local/bin:/bin:/usr/bin:/app/node_modules/phantomjs-prebuilt/bin

ADD docker/index.js /app

ENTRYPOINT ["node", "index.js"]
20 changes: 19 additions & 1 deletion README.md
@@ -1,2 +1,20 @@
# goose-phantom-environment
# Goose Phantom Environment

An environment for Goose parser that allows it to run in PhantomJS.

## PhantomEnvironment
This environment is used to run the Parser on Node.js.
```JS
var env = new PhantomEnvironment({
  url: 'http://google.com',
});
```
The only required parameter is `url`: the address of the page where the Parser starts.

This environment can take snapshots, use proxy lists or a custom proxy rotator, apply whitelists and blacklists to resource loading, and more. See the [list of options](https://github.com/redco/goose-parser/blob/master/lib/PhantomEnvironment.js#L35) for details.
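
As a rough illustration of passing more than `url`, the sketch below builds an options object from defaults plus user overrides. The option names and default values are the ones the `docker/index.js` entrypoint in this change uses; they are not an exhaustive list. `buildEnvOptions` is a hypothetical helper that exists only for this example.

```JS
// Hypothetical helper (not part of the library): merge user overrides,
// given as a JSON string, over a set of default environment options.
// The defaults mirror the values used in docker/index.js.
function buildEnvOptions(url, overridesJson) {
  const defaults = {
    url,
    snapshot: false,
    loadImages: true,
    screen: { width: 1080, height: 768 },
    webSecurity: false,
  };
  if (!overridesJson) {
    return defaults;
  }
  // Object.assign performs a shallow merge: a nested object such as
  // `screen` is replaced as a whole, not merged key by key.
  return Object.assign({}, defaults, JSON.parse(overridesJson));
}

const options = buildEnvOptions(
  'http://google.com',
  '{"snapshot": true, "screen": {"width": 1920}}'
);
console.log(options);
```

Because the merge is shallow, an override that sets only `screen.width` drops the default `screen.height`, so overrides for nested options should carry every key they need. The resulting object is what would be passed to `new PhantomEnvironment(...)`.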

## Tests
To run the [tests](https://github.com/redco/goose-parser/blob/master/tests/phantom_parser_test.js), use the command:
```bash
npm test
```
17 changes: 17 additions & 0 deletions build.js
@@ -0,0 +1,17 @@
/* eslint import/no-extraneous-dependencies: ['error', {devDependencies: true}] */
const rimraf = require('rimraf');
const cp = require('child_process');
const fs = require('fs');
const pkg = require('./package.json');

rimraf.sync('build');
cp.spawnSync('babel', ['lib', '-d', 'build/lib'], { stdio: 'inherit' });
delete pkg.private;
delete pkg.devDependencies;
delete pkg.scripts;
delete pkg['pre-commit'];
delete pkg['lint-staged'];
fs.writeFileSync('build/package.json', JSON.stringify(pkg, null, ' '), 'utf-8');
fs.writeFileSync('build/LICENSE', fs.readFileSync('LICENSE', 'utf-8'), 'utf-8');
fs.writeFileSync('build/README.md', fs.readFileSync('README.md', 'utf-8'), 'utf-8');
cp.spawnSync('cp', ['-R', 'vendor', 'build/vendor']);
28 changes: 28 additions & 0 deletions circle.yml
@@ -0,0 +1,28 @@
# https://circleci.com/docs/install-and-use-yarn/
machine:
  node:
    version: 7.7.2
  environment:
    YARN_VERSION: 0.18.1
    PATH: "${PATH}:${HOME}/.yarn/bin:${HOME}/${CIRCLE_PROJECT_REPONAME}/node_modules/.bin"

dependencies:
  pre:
    - |
      if [[ ! -e ~/.yarn/bin/yarn || $(yarn --version) != "${YARN_VERSION}" ]]; then
        echo "Download and install Yarn."
        curl -o- -L https://yarnpkg.com/install.sh | bash -s -- --version $YARN_VERSION
      else
        echo "The correct version of Yarn is already installed."
      fi
  override:
    - yarn install
  cache_directories:
    - ~/.yarn
    - ~/.cache/yarn

test:
  pre:
    - yarn run lint
  override:
    - yarn run build
66 changes: 66 additions & 0 deletions docker/index.js
@@ -0,0 +1,66 @@


const argv = require('minimist')(process.argv.slice(2));

const cleanStdout = process.env.CLEAN_STDOUT !== undefined;
const Goose = require('goose-parser');

const url = argv._[0];

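let rules;
const rulesFile = argv['rules-file'];
if (rulesFile) {
  // Load the rules from the module or JSON file given via --rules-file
  rules = require(rulesFile);
} else {
  // Otherwise expect the rules as a JSON string in the second positional argument
  try {
    rules = JSON.parse(argv._[1]);
  } catch (e) {
    console.error('Error occurred while parsing rules');
    throw e;
  }
}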

const envOptionsStr = argv['env-options'];
let envOptions = {
  url,
  snapshot: false,
  loadImages: true,
  screen: {
    width: 1080,
    height: 768,
  },
  webSecurity: false,
};

if (envOptionsStr) {
  try {
    envOptions = Object.assign(envOptions, JSON.parse(envOptionsStr));
  } catch (e) {
    console.error('Error occurred while parsing environment options');
    throw e;
  }
}

const parser = new Goose.Parser({
  environment: new Goose.PhantomEnvironment(envOptions),
});

const time = (new Date()).getTime();
parser
  .parse(rules)
  .done((results) => {
    if (!cleanStdout) {
      console.log('Work is done');
      console.log('Execution time: ' + ((new Date()).getTime() - time));
      console.log('Results:');
      console.log(require('util').inspect(results, { showHidden: false, depth: null }));
    } else {
      console.log(JSON.stringify(results, null, ' '));
    }
  }, (e) => {
    if (!cleanStdout) {
      console.log('Error occurred');
      console.log(e.message, e.stack);
    }
    console.log(JSON.stringify({ 'goose-error': e.message }));
  });
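
When `CLEAN_STDOUT` is set, the entrypoint above prints a single JSON document: either the parse results or an object with a `goose-error` key. A caller could tell the two apart roughly as sketched below. `parseGooseOutput` is a hypothetical helper written for this note, not something this change adds, and it assumes the captured stdout contains nothing but that one JSON document.

```js
// Hypothetical helper for consumers of the entrypoint's clean output.
// Assumes `stdout` is exactly one JSON document, as printed when
// CLEAN_STDOUT is set.
function parseGooseOutput(stdout) {
  const payload = JSON.parse(stdout);
  // An error is reported as a plain object with a string `goose-error` key.
  const isError = payload !== null
    && typeof payload === 'object'
    && !Array.isArray(payload)
    && typeof payload['goose-error'] === 'string';
  return isError
    ? { ok: false, error: payload['goose-error'] }
    : { ok: true, results: payload };
}

console.log(parseGooseOutput('{"goose-error": "timeout"}'));
console.log(parseGooseOutput('[{"title": "first"}]'));
```

One limitation of this approach: a legitimate result object that happens to contain a string `goose-error` field would be misread as a failure, so rule authors may want to avoid that key name.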