Package org.apache.sedona.core.utils
Class RDDSampleUtils
- java.lang.Object
  - org.apache.sedona.core.utils.RDDSampleUtils
public class RDDSampleUtils extends Object
Utility class for computing how many samples to take from an RDD before partitioning it.
Constructor Summary
Constructors
Constructor: RDDSampleUtils()
Method Summary
Modifier and Type: static int
Method: getSampleNumbers(int numPartitions, long totalNumberOfRecords, int givenSampleNumbers)
Description: Returns the number of samples to take to partition the RDD into the specified number of partitions.
Method Detail
getSampleNumbers
public static int getSampleNumbers(int numPartitions, long totalNumberOfRecords, int givenSampleNumbers)
Returns the number of samples to take to partition the RDD into the specified number of partitions. The number of partitions cannot exceed half the number of records in the RDD.
Returns the total number of records if it is less than 1000. Otherwise, returns 1% of the total number of records or twice the number of partitions, whichever is larger. Never returns a number greater than Integer.MAX_VALUE.
If the desired number of samples is not -1, returns that number.
- Parameters:
numPartitions - the number of partitions
totalNumberOfRecords - the total number of records
givenSampleNumbers - the requested number of samples, or -1 to have the number computed
- Returns:
the number of samples
- Throws:
IllegalArgumentException - if the requested number of samples exceeds the total number of records, or if the requested number of partitions exceeds half of the total number of records
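The rules described for getSampleNumbers can be sketched as a small standalone method. This is an illustration derived only from the description above, not Sedona's source. The class name SampleCountSketch and the method name sketchSampleCount are hypothetical. The order of the checks, the use of integer division for the 1% figure, and skipping the partition check when an explicit sample count is given are all assumptions.

```java
// Illustrative sketch only: follows the documented rules, not Sedona's actual code.
final class SampleCountSketch {

    // Hypothetical helper mirroring the documented contract of getSampleNumbers.
    static int sketchSampleCount(int numPartitions, long totalRecords, int givenSamples) {
        // An explicit request (anything other than -1) is returned unchanged,
        // but it may not exceed the number of records. (Check order is an assumption.)
        if (givenSamples != -1) {
            if (givenSamples > totalRecords) {
                throw new IllegalArgumentException("requested samples exceed total records");
            }
            return givenSamples;
        }
        // The number of partitions may not exceed half the number of records.
        if (2L * numPartitions > totalRecords) {
            throw new IllegalArgumentException("partitions exceed half of total records");
        }
        // Fewer than 1000 records: every record is a sample.
        if (totalRecords < 1000) {
            return (int) totalRecords;
        }
        // Otherwise take the larger of 1% of the records and twice the
        // partition count, capped at Integer.MAX_VALUE.
        long onePercent = totalRecords / 100; // integer division is an assumption
        long twicePartitions = 2L * numPartitions;
        return (int) Math.min(Math.max(onePercent, twicePartitions), Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(sketchSampleCount(4, 500L, -1));           // 500: fewer than 1000 records
        System.out.println(sketchSampleCount(4, 1_000_000L, -1));     // 10000: 1% of the records
        System.out.println(sketchSampleCount(10_000, 1_000_000L, -1)); // 20000: twice the partitions
        System.out.println(sketchSampleCount(4, 1_000_000L, 2_500));  // 2500: explicit request
    }
}
```

In this sketch, 2L * numPartitions is computed as a long so that a large partition count cannot overflow before the comparison.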