Implementing a Hadoop MapReduce Program to Find the Two Highest Temperatures in Weather Data


Data

Each record holds a timestamp and a temperature. In the input file the two fields are separated by a tab character.

    1949-10-01 14:21:02 34c
    1949-10-01 19:21:02 38c
    1949-10-02 14:01:02 36c
    1950-01-01 11:21:02 32c
    1950-10-01 12:21:02 37c
    1951-12-01 12:21:02 23c
    1950-10-02 12:21:02 41c
    1950-10-03 12:21:02 27c
    1951-07-01 12:21:02 45c
    1951-07-02 12:21:02 46c
    1951-07-03 12:21:03 47c

Main program

    package com.zyd.tq;

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class TQRunner {
        public static void main(String[] args)
                throws IOException, ClassNotFoundException, InterruptedException {
            // Create the configuration and the job
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf);
            job.setJarByClass(TQRunner.class);
            job.setJobName("TQ");

            // Input and output paths come from the command line
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            // Map step
            job.setMapperClass(TQMapper.class);
            // Reduce step
            job.setReducerClass(TQReducer.class);

            // The map output key and value are both Text.
            // If the key class is wrong, the job fails at startup with
            // "Unable to initialize any output collector".
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);

            // The final (reduce) output key and value are Text as well
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            // Exit with a non-zero status if the job fails
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Map phase code:

    package com.zyd.tq;

    import java.io.IOException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    /**
     * Type parameters: input key, input value, output key, output value.
     * The output key is the timestamp truncated to year and month (yyyy-MM);
     * the output value is the temperature, kept as a string.
     *
     * @author Administrator
     */
    public class TQMapper extends Mapper<Object, Text, Text, Text> {

        /**
         * Overrides map(); called once per input line.
         */
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // The timestamp and the temperature are separated by a tab
            String[] split = value.toString().split("\t");
            // Skip blank or malformed lines instead of failing the task
            if (split.length < 2 || split[0].length() < 7) {
                return;
            }
            // Timestamp
            String time = split[0];
            // Temperature
            String wd = split[1];
            context.write(new Text(time.substring(0, 7)), new Text(wd));
        }
    }

Reduce phase:

    package com.zyd.tq;

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Collections;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    /**
     * Runs on the map output after it has been partitioned and grouped by key.
     *
     * @author Administrator
     */
    public class TQReducer extends Reducer<Text, Text, Text, Text> {

        /**
         * Values with the same key form one group.
         * iterable holds every temperature recorded in one month,
         * pulled from all of the map tasks.
         */
        @Override
        protected void reduce(Text key, Iterable<Text> iterable, Context context)
                throws IOException, InterruptedException {
            ArrayList<String> list = new ArrayList<String>();
            for (Text text : iterable) {
                list.add(text.toString());
            }
            // Sort lexicographically. This matches numeric order only while
            // every temperature has the same number of digits and none is
            // negative, which holds for the sample data.
            Collections.sort(list);
            // The two highest temperatures are the last two elements
            String maxWD = list.get(list.size() - 1);
            String tmp = "";
            if (list.size() >= 2) {
                String secondWD = list.get(list.size() - 2);
                tmp = ":" + secondWD;
            }
            context.write(key, new Text(maxWD + tmp));
        }
    }
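The reducer above sorts the temperature strings lexicographically, so a value such as "9c" would rank above "10c". The following is a minimal sketch, in plain Java without Hadoop, of the same map and reduce logic with a numeric comparison instead. The class name TopTwoTemps and its methods are hypothetical and not part of the original job; the sketch assumes every temperature is an integer followed by the letter c and that fields are tab-separated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not from the original job): top two temperatures per month, compared numerically. */
final class TopTwoTemps {

    /** Parses a value such as "38c" into 38. Assumes an integer with a trailing 'c'. */
    static int parseTemp(String raw) {
        String s = raw.trim();
        if (s.endsWith("c")) {
            s = s.substring(0, s.length() - 1);
        }
        return Integer.parseInt(s);
    }

    /** Returns month (yyyy-MM) mapped to "max:second", or just "max" for a single reading. */
    static Map<String, String> topTwoPerMonth(List<String> lines) {
        // "Map" step: group temperatures by the yyyy-MM prefix of the timestamp
        Map<String, List<Integer>> byMonth = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length < 2 || fields[0].length() < 7) {
                continue; // skip blank or malformed lines
            }
            String month = fields[0].substring(0, 7);
            byMonth.computeIfAbsent(month, k -> new ArrayList<>()).add(parseTemp(fields[1]));
        }

        // "Reduce" step: sort each group in descending numeric order, keep the first two
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : byMonth.entrySet()) {
            List<Integer> temps = entry.getValue();
            temps.sort(Comparator.reverseOrder());
            String out = temps.get(0) + "c";
            if (temps.size() >= 2) {
                out += ":" + temps.get(1) + "c";
            }
            result.put(entry.getKey(), out);
        }
        return result;
    }

    public static void main(String[] args) {
        // A few rows from the sample data, with a tab before the temperature
        List<String> sample = Arrays.asList(
                "1949-10-01 14:21:02\t34c",
                "1949-10-01 19:21:02\t38c",
                "1949-10-02 14:01:02\t36c",
                "1951-12-01 12:21:02\t23c");
        topTwoPerMonth(sample).forEach((month, temps) -> System.out.println(month + "\t" + temps));
    }
}
```

Inside the real reducer, the same idea would amount to parsing each Text value with a helper like parseTemp before sorting. Collecting all values per key is fine for monthly weather readings, though tracking only the two largest values seen so far would avoid holding the whole group in memory.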


262 2022-11-24
