Wednesday, April 15, 2015

Randomly Select Object from a Set Performance

I mentioned in my last post that there were two ways to randomly select an object from a set in Java.

Option 1 - cycling through the set until we reached a randomly selected position
OR
Option 2 - converting the set to a list and directly grabbing a randomly selected position

Option 2 seems the fastest of the two options, but only when the set is large.  If we are randomly selecting an object the average position selection will be the middle value.

Notice here that randomly selecting a number between 1 and 10 a million times yields an average number of 5.498191.

int total = 0;
        for (int i = 0; i < 1000000; i++) {
            int num = randomGenerator.nextInt(10) + 1;
            total += num;
        }
       
        double avgNum = (total * 1.0)/1000000.0;
        System.out.println("Avg Random Number from 1 - 10: " + avgNum);


Thus I wrote a method that predicts at what set size we should abandon Option 1 and start using Option 2.  This size, of course, depends on different OS specifications, JVM initialization parameters, and machine hardware.

    public long getRecommendedSetSizeLimit() {
        long recommendation = -1;

        long accuracy = DEFAULT_SET_SIZE_ACCURACY;
        if (ApplicationLoader.INSTANCE.doesApplicationPropertyExist(SET_SIZE_ACCURACY_PROPERTY_NAME)) {
            final String property = ApplicationLoader.INSTANCE.getApplicationProperty(SET_SIZE_ACCURACY_PROPERTY_NAME);
            if (property != null && !property.isEmpty()) {
                try {
                    accuracy = Long.parseLong(property);
                } catch (Exception e) {
                    LOGGER.warn("unable to parse property " + SET_SIZE_ACCURACY_PROPERTY_NAME + " ()", e);
                }
            }
        }

        long acceptanceMilliseconds = DEFAULT_METHOD_ACCEPTANCE_DIFFERENCE_MILLISECONDS;
        if (ApplicationLoader.INSTANCE.doesApplicationPropertyExist(METHOD_ACCEPTANCE_DIFFERENCE_PROPERTY_NAME)) {
            final String property = ApplicationLoader.INSTANCE.getApplicationProperty(METHOD_ACCEPTANCE_DIFFERENCE_PROPERTY_NAME);
            if (property != null && !property.isEmpty()) {
                try {
                    acceptanceMilliseconds = Long.parseLong(property);
                } catch (Exception e) {
                    LOGGER.warn("unable to parse property " + METHOD_ACCEPTANCE_DIFFERENCE_PROPERTY_NAME + " ()", e);
                }
            }
        }

        long setSize = 10000;

        do {
            final Set<Object> hugeSetOObjects = this.createSetOfRandomObjects(setSize);

            final int sizeOfSet = hugeSetOObjects.size();
            final int halfSizeOfSet = sizeOfSet / 2;

            LOGGER.info("Size of set being evaluated: " + sizeOfSet);

            final Date start1 = new Date();
            for (int i = 0; i < 500; i++) {
                this.selectMiddleObjectInSet(hugeSetOObjects, halfSizeOfSet);
            }
            final Date end1 = new Date();
            final long diff1 = end1.getTime() - start1.getTime();
            LOGGER.info("Time to Execute (set process): " + diff1);

            final Date start2 = new Date();
            for (int i = 0; i < 500; i++) {
                this.convertSetToListAndSelectMiddleInList(hugeSetOObjects, halfSizeOfSet);
            }
            final Date end2 = new Date();
            final long diff2 = end2.getTime() - start2.getTime();
            LOGGER.info("Time to execute (list process): " + diff2);

            long diff = diff1 - diff2;
            if (diff < 0) {
                diff = diff * -1;
            }

            if (diff1 > diff2) {
                LOGGER.warn("Difference between methods: " + diff);
                LOGGER.warn("Consider lowering the application property (acceptanceDiffMilliseconds)");
            }

            if (diff > acceptanceMilliseconds && diff1 > diff2) {
                recommendation = hugeSetOObjects.size();
                break;
            } else if (setSize >= 1000000) {
                recommendation = 1000000;
                break;
            } else {
                setSize += accuracy;
            }

        } while (true);

        return recommendation;
    }

    private Set<Object> createSetOfRandomObjects(final long size) {
        final Set<Object> hugeSetOObjects = new HashSet<>();

        for (int i = 0; i < size; i++) {
            final UUID id = UUID.randomUUID();
            hugeSetOObjects.add(id);
        }

        return hugeSetOObjects;
    }

    private Object selectMiddleObjectInSet(final Set<Object> setOfObjects,
            final int half) {

        int pos = 0;
        for (Object obj : setOfObjects) {
            if (pos == half) {
                return obj;
            } else {
                pos++;
            }
        }

        return null;
    }

    private Object convertSetToListAndSelectMiddleInList(final Set<Object> setOfObjects,
            final int half) {

        final List<Object> listCopy = new ArrayList<>();
        listCopy.addAll(setOfObjects);
        return listCopy.get(half);
    }


In my next post I will reveal the stats from different machines, operating systems, and JVM options.