@InterfaceAudience.Public @InterfaceStability.Stable public class MultipleOutputs extends Object
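The MultipleOutputs class simplifies writing to additional outputs other
 than the job default output via the OutputCollector passed to
 the map() and reduce() methods of the
 Mapper and Reducer implementations.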
 
 Each additional output, or named output, may be configured with its own
 OutputFormat, with its own key class and with its own value
 class.
 
A named output can be a single file or a multi file. The latter is referred to as a multi named output.
 A multi named output is an unbounded set of files all sharing the same
 OutputFormat, key class and value class configuration.
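
 The distinction between single and multi named outputs can be pictured with a small in-memory model. This is only a sketch under stated assumptions, not the Hadoop class: NamedOutputModel and its write() methods are hypothetical, the name_multiName key used to separate the files of a multi named output is an illustrative choice, and the strictness of the error checks is an assumption of the model. The isMultiNamedOutput() behaviour (false for single or undefined names) follows the method description on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of named outputs; it is not the Hadoop class. */
final class NamedOutputModel {
    // Declared named outputs: name -> true if multi, false if single.
    private final Map<String, Boolean> declared = new HashMap<>();
    // One record list per logical "file"; a TreeMap keeps the keys sorted.
    private final Map<String, List<String>> files = new TreeMap<>();

    void addNamedOutput(String name) { declared.put(name, false); }

    void addMultiNamedOutput(String name) { declared.put(name, true); }

    /** Returns false for single named outputs and for undefined names. */
    boolean isMultiNamedOutput(String name) {
        return declared.getOrDefault(name, false);
    }

    /** Writes a record to a single named output (model assumption: must be declared single). */
    void write(String name, String record) {
        if (!declared.containsKey(name) || isMultiNamedOutput(name)) {
            throw new IllegalArgumentException("Not a single named output: " + name);
        }
        files.computeIfAbsent(name, k -> new ArrayList<>()).add(record);
    }

    /** Writes a record to one file of a multi named output, selected by multiName. */
    void write(String name, String multiName, String record) {
        if (!isMultiNamedOutput(name)) {
            throw new IllegalArgumentException("Not a multi named output: " + name);
        }
        // Assumed key scheme for illustration only: name + "_" + multiName.
        files.computeIfAbsent(name + "_" + multiName, k -> new ArrayList<>()).add(record);
    }

    Map<String, List<String>> files() { return files; }

    public static void main(String[] args) {
        NamedOutputModel model = new NamedOutputModel();
        model.addNamedOutput("text");
        model.addMultiNamedOutput("seq");
        model.write("text", "Hello");
        model.write("seq", "A", "Bye");
        model.write("seq", "B", "Chau");
        System.out.println(model.files());
    }
}
```

 In this model the single named output always maps to one record list, while the multi named output grows a new list for every distinct multiName, which is the sense in which the set of files is unbounded.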
 
 When named outputs are used within a Mapper implementation,
 key/values written to a named output are not part of the reduce phase; only
 key/values written to the job OutputCollector are part of the
 reduce phase.
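
 The effect of writing to a named output from a Mapper can be illustrated with a toy, Hadoop-free simulation. This is a sketch only: MapSideNamedOutputDemo, its two lists and its map() and reduce() methods are hypothetical stand-ins for the job OutputCollector, a named output and the two phases, and no Hadoop API is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy, Hadoop-free simulation; every name here is hypothetical. */
final class MapSideNamedOutputDemo {
    // Stands in for the job OutputCollector: these records reach the reduce phase.
    final List<String> jobOutput = new ArrayList<>();
    // Stands in for a named output: these records are written out directly.
    final List<String> namedOutput = new ArrayList<>();

    /** Simulated map(): lines starting with '#' go to the named output, the rest to the job output. */
    void map(String line) {
        if (line.startsWith("#")) {
            namedOutput.add(line);
        } else {
            jobOutput.add(line);
        }
    }

    /** Simulated reduce phase: counts only the records the job output received. */
    Map<String, Integer> reduce() {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : jobOutput) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        MapSideNamedOutputDemo job = new MapSideNamedOutputDemo();
        for (String line : new String[] {"a", "# note", "b", "a"}) {
            job.map(line);
        }
        System.out.println("reduce phase sees: " + job.reduce());
        System.out.println("named output has:  " + job.namedOutput);
    }
}
```

 The record sent to the named output never appears in the reduce() result, mirroring the rule that only key/values written to the job OutputCollector take part in the reduce phase.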
 
 MultipleOutputs supports counters, which are disabled by default. The counters
 group is the MultipleOutputs class name.
 
Job configuration usage pattern is:
 JobConf conf = new JobConf();
 FileInputFormat.setInputPaths(conf, inDir);
 FileOutputFormat.setOutputPath(conf, outDir);
 conf.setMapperClass(MOMap.class);
 conf.setReducerClass(MOReduce.class);
 ...
 // Defines additional single text based output 'text' for the job
 MultipleOutputs.addNamedOutput(conf, "text", TextOutputFormat.class,
   LongWritable.class, Text.class);
 // Defines additional multi sequencefile based output 'seq' for the job
 MultipleOutputs.addMultiNamedOutput(conf, "seq",
   SequenceFileOutputFormat.class,
   LongWritable.class, Text.class);
 ...
 JobClient jc = new JobClient();
 RunningJob job = jc.submitJob(conf);
 ...
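
 The names passed to addNamedOutput and addMultiNamedOutput are restricted: per the parameter descriptions on this page, a name has to be a word made of letters and numbers only, and it cannot be the word 'part', which is reserved for the default output. A minimal sketch of that rule as a standalone check follows. NamedOutputNames and isValid are hypothetical helpers, not part of Hadoop, and reading "letters and numbers" as ASCII letters and digits is an assumption.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that restates the documented naming rule; not part of Hadoop. */
final class NamedOutputNames {
    // Assumption: "letters and numbers only" means one or more ASCII letters or digits.
    private static final Pattern WORD = Pattern.compile("[A-Za-z0-9]+");

    private NamedOutputNames() {}

    /** Returns true if the name is a non-empty alphanumeric word other than "part". */
    static boolean isValid(String name) {
        return name != null
            && WORD.matcher(name).matches()
            && !"part".equals(name);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"text", "seq", "part", "my-output", ""}) {
            System.out.println("'" + name + "' valid: " + isValid(name));
        }
    }
}
```

 Running a check like this before building the JobConf is one way to surface a bad name early; what the library itself does with an invalid name is not described on this page.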
Usage pattern in a Reducer is:
 public class MOReduce implements
   Reducer<WritableComparable, Writable, WritableComparable, Writable> {
   private MultipleOutputs mos;

   public void configure(JobConf conf) {
     ...
     mos = new MultipleOutputs(conf);
   }

   public void reduce(WritableComparable key, Iterator<Writable> values,
       OutputCollector<WritableComparable, Writable> output, Reporter reporter)
       throws IOException {
     ...
     // Single named output 'text'
     mos.getCollector("text", reporter).collect(key, new Text("Hello"));
     // Multi named output 'seq', one file set per multi name ('A', 'B')
     mos.getCollector("seq", "A", reporter).collect(key, new Text("Bye"));
     mos.getCollector("seq", "B", reporter).collect(key, new Text("Chau"));
     ...
   }

   public void close() throws IOException {
     mos.close();
     ...
   }
 }
 | Constructor and Description | 
|---|
| MultipleOutputs(JobConf job): Creates and initializes multiple named outputs support. It should be instantiated in the Mapper/Reducer configure method. |
| Modifier and Type | Method and Description | 
|---|---|
| static void | addMultiNamedOutput(JobConf conf, String namedOutput, Class<? extends OutputFormat> outputFormatClass, Class<?> keyClass, Class<?> valueClass): Adds a multi named output for the job. |
| static void | addNamedOutput(JobConf conf, String namedOutput, Class<? extends OutputFormat> outputFormatClass, Class<?> keyClass, Class<?> valueClass): Adds a named output for the job. |
| void | close(): Closes all the opened named outputs. |
| OutputCollector | getCollector(String namedOutput, Reporter reporter): Gets the output collector for a named output. |
| OutputCollector | getCollector(String namedOutput, String multiName, Reporter reporter): Gets the output collector for a multi named output. |
| static boolean | getCountersEnabled(JobConf conf): Returns whether the counters for the named outputs are enabled. |
| static Class<? extends OutputFormat> | getNamedOutputFormatClass(JobConf conf, String namedOutput): Returns the named output OutputFormat. |
| static Class<?> | getNamedOutputKeyClass(JobConf conf, String namedOutput): Returns the key class for a named output. |
| Iterator<String> | getNamedOutputs(): Returns an iterator over the defined named outputs. |
| static List<String> | getNamedOutputsList(JobConf conf): Returns the list of channel names. |
| static Class<?> | getNamedOutputValueClass(JobConf conf, String namedOutput): Returns the value class for a named output. |
| static boolean | isMultiNamedOutput(JobConf conf, String namedOutput): Returns whether a named output is multi. |
| static void | setCountersEnabled(JobConf conf, boolean enabled): Enables or disables counters for the named outputs. |
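
public MultipleOutputs(JobConf job)
Creates and initializes multiple named outputs support. It should be instantiated in the Mapper/Reducer configure method.
Parameters:
job - the job configuration object

public static List<String> getNamedOutputsList(JobConf conf)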
Returns the list of channel names.
Parameters:
conf - job conf

public static boolean isMultiNamedOutput(JobConf conf, String namedOutput)
Returns whether a named output is multi.
Parameters:
conf - job conf
namedOutput - named output
Returns:
true if the named output is multi, false if it is single. If the named output is not defined, it returns false.

public static Class<? extends OutputFormat> getNamedOutputFormatClass(JobConf conf, String namedOutput)
Returns the named output OutputFormat.
Parameters:
conf - job conf
namedOutput - named output

public static Class<?> getNamedOutputKeyClass(JobConf conf, String namedOutput)
Returns the key class for a named output.
Parameters:
conf - job conf
namedOutput - named output

public static Class<?> getNamedOutputValueClass(JobConf conf, String namedOutput)
Returns the value class for a named output.
Parameters:
conf - job conf
namedOutput - named output

public static void addNamedOutput(JobConf conf, String namedOutput, Class<? extends OutputFormat> outputFormatClass, Class<?> keyClass, Class<?> valueClass)
Adds a named output for the job.
Parameters:
conf - job conf to add the named output to
namedOutput - named output name. It has to be a word made of letters and numbers only, and it cannot be the word 'part', as that is reserved for the default output.
outputFormatClass - OutputFormat class
keyClass - key class
valueClass - value class

public static void addMultiNamedOutput(JobConf conf, String namedOutput, Class<? extends OutputFormat> outputFormatClass, Class<?> keyClass, Class<?> valueClass)
Adds a multi named output for the job.
Parameters:
conf - job conf to add the named output to
namedOutput - named output name. It has to be a word made of letters and numbers only, and it cannot be the word 'part', as that is reserved for the default output.
outputFormatClass - OutputFormat class
keyClass - key class
valueClass - value class

public static void setCountersEnabled(JobConf conf, boolean enabled)
Enables or disables counters for the named outputs.
MultipleOutputs supports counters, which are disabled by default. The counters group is the MultipleOutputs class name.
Parameters:
conf - job conf in which to enable or disable the counters
enabled - indicates if the counters will be enabled or not

public static boolean getCountersEnabled(JobConf conf)
Returns whether the counters for the named outputs are enabled.
MultipleOutputs supports counters, which are disabled by default. The counters group is the MultipleOutputs class name.
Parameters:
conf - job conf to read the counters setting from

public Iterator<String> getNamedOutputs()
Returns an iterator over the defined named outputs.

public OutputCollector getCollector(String namedOutput, Reporter reporter) throws IOException
Gets the output collector for a named output.
Parameters:
namedOutput - the named output name
reporter - the reporter
Throws:
IOException - thrown if the output collector could not be created

public OutputCollector getCollector(String namedOutput, String multiName, Reporter reporter) throws IOException
Gets the output collector for a multi named output.
Parameters:
namedOutput - the named output name
multiName - the multi name part
reporter - the reporter
Throws:
IOException - thrown if the output collector could not be created

public void close() throws IOException
Closes all the opened named outputs.
If overridden, subclasses must invoke super.close() at the end of their close().
Throws:
IOException - thrown if any of the MultipleOutput files could not be closed properly

Copyright © 2024 Apache Software Foundation. All rights reserved.