Java的语言特点是什么(java语言的主要特点是什么)
580
2022-05-30
/** * Count the number of elements for each key, collecting the results to a local Map. * * @note This method should only be used if the resulting map is expected to be small, as * the whole thing is loaded into the driver's memory. * To handle very large results, consider using rdd.mapValues(_ => 1L).reduceByKey(_ + _), which * returns an RDD[T, Long] instead of a map. */
计算每个键的元素数,将结果放到Map中去。
注意:
只有当数据量很小时,才应使用此方法,因为整个数据都被载入内存中。
如果要处理大量数据,请考虑使用rdd.mapValues(_ => 1L).reduceByKey(_ + _),
返回的结果是 RDD[T, Long] 而不是Map。
// java public java.util.Map
public class CountByKey { public static void main(String[] args) { System.setProperty("hadoop.home.dir", "E:\hadoop-2.7.1"); SparkConf sparkConf = new SparkConf().setMaster("local").setAppName("Spark_DEMO"); JavaSparkContext sc = new JavaSparkContext(sparkConf); JavaPairRDD
19/03/20 16:36:11 INFO DAGScheduler: ResultStage 1 (countByKey at CountByKey.java:23) finished in 0.093 s 19/03/20 16:36:11 INFO DAGScheduler: Job 0 finished: countByKey at CountByKey.java:23, took 1.229949 s duck:1 cat:3 dog:1 pig:1 19/03/20 16:36:11 INFO SparkContext: Invoking stop() from shutdown hook
/** * Approximate version of countByKey that can return a partial result if it does * not finish within a timeout. * * The confidence is the probability that the error bounds of the result will * contain the true value. That is, if countApprox were called repeatedly * with confidence 0.9, we would expect 90% of the results to contain the * true count. The confidence must be in the range [0,1] or an exception will * be thrown. * * @param timeout maximum time to wait for the job, in milliseconds * @param confidence the desired statistical confidence in the result * @return a potentially incomplete result, with error bounds */
CountByKey的近似版本,如果没有在规定时间内完成就返回部分结果。
@参数超时等待作业的最长时间(毫秒)
@参数置信度结果中所需的统计置信度
@返回一个可能不完整的结果,带有错误界限
// java public PartialResult
EI企业智能 Java spark 可信智能计算服务 TICS 智能数据
版权声明:本文内容由网络用户投稿,版权归原作者所有,本站不拥有其著作权,亦不承担相应法律责任。如果您发现本站中有涉嫌抄袭或描述失实的内容,请联系我们jiasou666@gmail.com 处理,核实后本网站将在24小时内删除侵权内容。