-
[Sample template](https://github.com/t-ivanov/HadoopExamples/blob/master/Document_Template.md) that can be used when creating a workload description.
-
List all available example programs: $ yarn jar hadoop-mapreduce-examples.jar
-
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
-
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
-
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
-
dbcount: An example job that counts pageviews from a database.
-
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
-
grep: A map/reduce program that counts the matches of a regex in the input.
-
join: A job that performs a join over sorted, equally partitioned datasets.
-
multifilewc: A job that counts words from several files.
-
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
-
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
-
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
-
randomwriter: A map/reduce program that writes 10GB of random data per node.
-
secondarysort: An example defining a secondary sort to the reduce.
-
sort: A map/reduce program that sorts the data written by the random writer.
-
[sudoku](https://github.com/danielOrbegoso/HadoopExamples/blob/master/Sudoku.md): A sudoku solver.
-
teragen: Generates the input data for terasort.
-
terasort: Runs the terasort.
-
teravalidate: Checks the results of terasort.
-
[wordcount](https://github.com/t-ivanov/HadoopExamples/blob/master/Wordcount.md): A map/reduce program that counts the words in the input files.
-
wordmean: A map/reduce program that computes the average length of the words in the input files.
-
wordmedian: A map/reduce program that computes the median length of the words in the input files.
-
wordstandarddeviation: A map/reduce program that computes the standard deviation of the length of the words in the input files.
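
As a rough illustration of how some of the examples above are invoked, the sketch below wraps `pi` and the `teragen` / `terasort` / `teravalidate` sequence in small shell functions. The jar path, the row count, and the `/tmp/terasort-demo` HDFS directory are assumptions; `run`, `estimate_pi`, and `terasort_pipeline` are hypothetical helper names, not part of Hadoop. By default the script only prints the commands (dry run), so it can be tried without a cluster.

```shell
#!/bin/sh
# Sketch only: jar path, row count, and HDFS directories are assumptions.
EXAMPLES_JAR="${EXAMPLES_JAR:-hadoop-mapreduce-examples.jar}"

# Hypothetical helper: print the command by default (DRY_RUN=1);
# set DRY_RUN=0 to actually execute it on a cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# pi <number of maps> <samples per map>
estimate_pi() {
  run yarn jar "$EXAMPLES_JAR" pi "$1" "$2"
}

# teragen <rows> <output dir>
# terasort <input dir> <output dir>
# teravalidate <terasort output dir> <report dir>
terasort_pipeline() {
  rows="$1"
  base="$2"
  run yarn jar "$EXAMPLES_JAR" teragen "$rows" "$base/input" &&
  run yarn jar "$EXAMPLES_JAR" terasort "$base/input" "$base/output" &&
  run yarn jar "$EXAMPLES_JAR" teravalidate "$base/output" "$base/report"
}

estimate_pi 10 1000
terasort_pipeline 1000000 /tmp/terasort-demo
```

Each teragen row is 100 bytes, so 1000000 rows produce roughly 100 MB of input. The three TeraSort steps are chained with `&&` so that a failed step stops the sequence when the commands are really executed.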
-
-
List all available test programs: $ yarn jar hadoop-mapreduce-client-jobclient-tests.jar
-
TestDFSIO: Distributed i/o benchmark.
-
DFSCIOTest: Distributed i/o benchmark of libhdfs.
-
DistributedFSCheck: Distributed checkup of the file system consistency.
-
JHLogAnalyzer: Job History Log analyzer.
-
MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures.
-
NNdataGenerator: Generate the data to be used by NNloadGenerator
-
NNloadGenerator: Generate load on Namenode using NN loadgenerator run WITHOUT MR
-
NNloadGeneratorMR: Generate load on Namenode using NN loadgenerator run as MR job
-
NNstructureGenerator: Generate the structure to be used by NNdataGenerator
-
SliveTest: HDFS Stress Test and Live Data Verification.
-
fail: A job that always fails.
-
filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed).
-
largesorter: Large-Sort tester
-
loadgen: Generic map/reduce load generator
-
mapredtest: A map/reduce test check.
-
minicluster: Single process HDFS and MR cluster.
-
mrbench: A map/reduce benchmark that can create many small jobs
-
nnbench: A benchmark that stresses the namenode.
-
sleep: A job that sleeps at each map and reduce task.
-
testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
-
testfilesystem: A test for FileSystem read/write.
-
testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
-
testsequencefile: A test for flat files of binary key value pairs.
-
testsequencefileinputformat: A test for sequence file input format.
-
testtextinputformat: A test for text input format.
-
threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill
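
A typical TestDFSIO session writes a set of files, reads them back, and then cleans up. The sketch below shows that sequence under some assumptions: the jar path, the file count (10), and the file size (1000, interpreted as MB) are illustrative, and `run` and `dfsio` are hypothetical helper names. The accepted flags can differ between Hadoop versions, so it is worth checking the usage text that TestDFSIO prints when started without arguments. The script prints the commands by default rather than executing them.

```shell
#!/bin/sh
# Sketch only: jar path, file count, and file size are assumptions.
TESTS_JAR="${TESTS_JAR:-hadoop-mapreduce-client-jobclient-tests.jar}"

# Hypothetical helper: print the command by default (DRY_RUN=1);
# set DRY_RUN=0 to actually execute it on a cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# dfsio <write|read|clean> [extra TestDFSIO options...]
dfsio() {
  mode="$1"
  shift
  run yarn jar "$TESTS_JAR" TestDFSIO "-$mode" "$@"
}

dfsio write -nrFiles 10 -fileSize 1000   # write benchmark
dfsio read -nrFiles 10 -fileSize 1000    # read the same files back
dfsio clean                              # remove the benchmark data
```

The read test reuses the files produced by the write test, so the write step has to run first with the same file count and size.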
-
-
[Hadoop examples source code](https://github.com/apache/hadoop-common/tree/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs)
-
[Good tutorial on how to run the examples](http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/)
-