How to Get the TaskId in a Spark RDD

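Use TaskContext.getPartitionId() to get the ID of the partition the current task is processing. Within a stage, each task handles exactly one partition, so this ID identifies the task; if you need the unique task attempt ID instead, use TaskContext.get().taskAttemptId().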

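// Requires: import org.apache.spark.TaskContext;
rdd.partitionBy(partitioner)
   .foreachPartition(iter -> {
       // ID of the partition this task is processing
       int id = TaskContext.getPartitionId();
       StringBuilder sb = new StringBuilder();
       sb.append("Pid=").append(id);
       // Append every record in this partition
       while (iter.hasNext()) {
           sb.append(", ").append(iter.next().toString());
       }
       // Runs on an executor, so the line appears in that executor's stdout
       System.out.println(sb.toString());
   });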