
How to generate a random alpha-numeric string

1914

I’ve been looking for a simple Java algorithm to generate a pseudo-random alpha-numeric string. In my situation it would be used as a unique session/key identifier that would “likely” be unique over 500K+ generation (my needs don’t really require anything much more sophisticated).

Ideally, I would be able to specify a length depending on my uniqueness needs. For example, a generated string of length 12 might look something like "AEYGF7K0DM1X".

6

  • 162

    Beware the birthday paradox.

    Oct 25, 2010 at 15:07


  • 62

    Even taking the birthday paradox in consideration, if you use 12 alphanumeric characters (62 total), you would still need well over 34 billion strings to reach the paradox. And the birthday paradox doesn’t guarantee a collision anyways, it just says it’s over 50% chance.

    Oct 29, 2012 at 4:13

  • 6

    @NullUserException 50 % success chance (per try) is damn high: even with 10 attempts, success rate is 0.999. With that and the fact that you can try A LOT in a period of 24 hours in mind, you don’t need 34 billion strings to be pretty sure to guess at least one of them. That is the reason why some session tokens should be really, really long.

    – Pijusn

    Jan 31, 2015 at 10:28

  • 19

    These three one-liners may be useful: Long.toHexString(Double.doubleToLongBits(Math.random())); UUID.randomUUID().toString(); RandomStringUtils.randomAlphanumeric(12);

    – Manindar

    Jun 8, 2016 at 7:31


  • 25

    @Pijusn I know this is old, but… the “50% chance” in the birthday paradox is NOT “per try”, it’s “50% chance that, out of (in this case) 34 billion strings, there exists at least one pair of duplicates”. You’d need 1.6 sextillion – 1.6e21 – entries in your database in order for there to be a 50% chance per try.

    Oct 11, 2017 at 19:21


1622

Algorithm

To generate a random string, concatenate characters drawn randomly from the set of acceptable symbols until the string reaches the desired length.

Implementation

Here’s some fairly simple and very flexible code for generating random identifiers. Read the information that follows for important application notes.

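import java.security.SecureRandom;
import java.util.Locale;
import java.util.Objects;
import java.util.Random;

public class RandomString {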

    /**
     * Generate a random string.
     */
    public String nextString() {
        for (int idx = 0; idx < buf.length; ++idx)
            buf[idx] = symbols[random.nextInt(symbols.length)];
        return new String(buf);
    }

    public static final String upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static final String lower = upper.toLowerCase(Locale.ROOT);

    public static final String digits = "0123456789";

    public static final String alphanum = upper + lower + digits;

    private final Random random;

    private final char[] symbols;

    private final char[] buf;

    public RandomString(int length, Random random, String symbols) {
        if (length < 1) throw new IllegalArgumentException();
        if (symbols.length() < 2) throw new IllegalArgumentException();
        this.random = Objects.requireNonNull(random);
        this.symbols = symbols.toCharArray();
        this.buf = new char[length];
    }

    /**
     * Create an alphanumeric string generator.
     */
    public RandomString(int length, Random random) {
        this(length, random, alphanum);
    }

    /**
     * Create an alphanumeric string generator backed by a secure random source.
     */
    public RandomString(int length) {
        this(length, new SecureRandom());
    }

    /**
     * Create session identifiers.
     */
    public RandomString() {
        this(21);
    }

}

Usage examples

Create an insecure generator for 8-character identifiers:

RandomString gen = new RandomString(8, ThreadLocalRandom.current());

Create a secure generator for session identifiers:

RandomString session = new RandomString();

Create a generator with easy-to-read codes for printing. The strings are longer than full alphanumeric strings to compensate for using fewer symbols:

String easy = RandomString.digits + "ACEFGHJKLMNPQRUVWXYabcdefhijkprstuvwx";
RandomString tickets = new RandomString(23, new SecureRandom(), easy);
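
To see why the reduced alphabet calls for a longer string, here is a minimal sketch that estimates entropy as length × log2(alphabet size), assuming every symbol is drawn uniformly and independently. The class and method names (EntropyEstimate, bits) are illustrative and not part of the answer's code.

```java
final class EntropyEstimate {

    /** Bits of entropy in `length` symbols drawn uniformly from `alphabetSize` choices. */
    static double bits(int length, int alphabetSize) {
        return length * (Math.log(alphabetSize) / Math.log(2));
    }

    public static void main(String[] args) {
        // The same reduced alphabet as in the "easy" example above.
        String easy = "0123456789" + "ACEFGHJKLMNPQRUVWXYabcdefhijkprstuvwx";

        // Default generator: 21 characters over 62 alphanumeric symbols.
        System.out.printf("21 chars, 62 symbols: %.1f bits%n", bits(21, 62));

        // Easy-to-read generator: 23 characters over the smaller alphabet.
        System.out.printf("23 chars, %d symbols: %.1f bits%n",
                easy.length(), bits(23, easy.length()));
    }
}
```

With these inputs the 23-character reduced-alphabet string should carry slightly more entropy than the 21-character alphanumeric one, which is the compensation the example describes.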

Use as session identifiers

Uniqueness alone is not enough for session identifiers; if it were, a simple counter would do. Attackers hijack sessions when identifiers are predictable.

There is tension between length and security. Shorter identifiers are easier to guess, because there are fewer possibilities. But longer identifiers consume more storage and bandwidth. A larger set of symbols helps, but might cause encoding problems if identifiers are included in URLs or re-entered by hand.

The underlying source of randomness, or entropy, for session identifiers should come from a random number generator designed for cryptography. However, initializing these generators can sometimes be computationally expensive or slow, so effort should be made to re-use them when possible.
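
One way to re-use the generator is to keep a single SecureRandom in a static field and share it across calls. The sketch below assumes a hypothetical helper class named SessionIds; it is not part of the RandomString class above, though a RandomString instance could be shared in the same way.

```java
import java.security.SecureRandom;

final class SessionIds {

    private static final String ALPHANUM =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Created once and shared; SecureRandom is documented as safe for
    // concurrent use, so there is no need for one instance per request.
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns `length` characters drawn uniformly from ALPHANUM. */
    static String next(int length) {
        char[] buf = new char[length];
        for (int i = 0; i < length; i++) {
            buf[i] = ALPHANUM.charAt(RANDOM.nextInt(ALPHANUM.length()));
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        System.out.println(next(21));
    }
}
```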

Use as object identifiers

Not every application requires security. Random assignment can be an efficient way for multiple entities to generate identifiers in a shared space without any coordination or partitioning. Coordination can be slow, especially in a clustered or distributed environment, and splitting up a space causes problems when entities end up with shares that are too small or too big.

Identifiers generated without taking measures to make them unpredictable should be protected by other means if an attacker might be able to view and manipulate them, as happens in most web applications. There should be a separate authorization system that protects objects whose identifier can be guessed by an attacker without access permission.

Care must also be taken to use identifiers that are long enough to make collisions unlikely given the anticipated total number of identifiers. This is referred to as “the birthday paradox.” The probability of a collision, p, is approximately n^2/(2q^x), where n is the number of identifiers actually generated, q is the number of distinct symbols in the alphabet, and x is the length of the identifiers. This should be a very small number, like 2^-50 or less.

Working this out shows that the chance of collision among 500k 15-character identifiers is about 2^-52, which is probably less likely than undetected errors from cosmic rays, etc.
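
The arithmetic can be checked with a short sketch that evaluates the approximation p ≈ n^2/(2q^x) in base-2 logarithms, which avoids overflowing on q^x. The class and method names here (CollisionEstimate, log2Probability, minLength) are made up for illustration, and the result is only as good as the approximation itself.

```java
final class CollisionEstimate {

    private static double log2(double v) {
        return Math.log(v) / Math.log(2);
    }

    /** log2 of the estimate n^2 / (2 * q^x). */
    static double log2Probability(long n, int q, int x) {
        return 2 * log2(n) - 1 - x * log2(q);
    }

    /** Smallest length x whose estimate is at most 2^maxLog2 (requires q >= 2). */
    static int minLength(long n, int q, double maxLog2) {
        int x = 1;
        while (log2Probability(n, q, x) > maxLog2) {
            x++;
        }
        return x;
    }

    public static void main(String[] args) {
        // 500k identifiers, 62 symbols, 15 characters.
        System.out.printf("log2(p) = %.2f%n", log2Probability(500_000, 62, 15));
        // Shortest alphanumeric length that keeps p at or below 2^-50.
        System.out.println("min length = " + minLength(500_000, 62, -50));
    }
}
```

For 500k alphanumeric identifiers, this estimate first drops below 2^-50 at 15 characters, which matches the figure quoted above.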

Comparison with UUIDs

According to their specification, UUIDs are not designed to be unpredictable, and should not be used as session identifiers.

UUIDs in their standard format take a lot of space: 36 characters for only 122 bits of entropy. (Not all bits of a “random” UUID are selected randomly.) A randomly chosen alphanumeric string packs more entropy in just 21 characters.
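
The 122-bit figure follows from the layout of a version 4 UUID: of its 128 bits, 4 encode the version and 2 encode the variant. A small sketch using java.util.UUID makes the fixed fields visible; the class name UuidLayout and its constants are illustrative only.

```java
import java.util.UUID;

final class UuidLayout {

    static final int TOTAL_BITS = 128;
    static final int VERSION_BITS = 4;  // fixed to 4 for random UUIDs
    static final int VARIANT_BITS = 2;  // fixed to binary 10 (variant 2)

    /** Bits left over for random data in a version 4 UUID. */
    static int randomBits() {
        return TOTAL_BITS - VERSION_BITS - VARIANT_BITS;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        System.out.println(id);
        System.out.println("characters:  " + id.toString().length()); // 36
        System.out.println("version:     " + id.version());           // 4
        System.out.println("variant:     " + id.variant());           // 2
        System.out.println("random bits: " + randomBits());           // 122
    }
}
```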

UUIDs are not flexible; they have a standardized structure and layout. This is their chief virtue as well as their main weakness. When collaborating with an outside party, the standardization offered by UUIDs may be helpful. For purely internal use, they can be inefficient.

9

  • 6

    If you need spaces in yours, you can tack on .replaceAll("\\d", " "); onto the end of the return new BigInteger(130, random).toString(32); line to do a regex swap. It replaces all digits with spaces. Works great for me: I’m using this as a substitute for a front-end Lorem Ipsum

    – weisjohn

    Oct 7, 2011 at 15:00


  • 4

    @weisjohn That’s a good idea. You can do something similar with the second method, by removing the digits from symbols and using a space instead; you can control the average “word” length by changing the number of spaces in symbols (more occurrences for shorter words). For a really over-the-top fake text solution, you can use a Markov chain!

    – erickson

    Oct 7, 2011 at 16:02

  • 4

    These identifiers are randomly selected from a space of a certain size. They could be 1 character long. If you want a fixed length, you can use the second solution, with a SecureRandom instance assigned to the random variable.

    – erickson

    Dec 20, 2011 at 0:15

  • 17

    @ejain because 32 = 2^5; each character will represent exactly 5 bits, and 130 bits can be evenly divided into characters.

    – erickson

    Feb 21, 2012 at 21:38

  • 3

    @erickson BigInteger.toString(int) doesn’t work that way, it’s actually calling Long.toString(long, int) to determine the character values (which gives a better JavaDoc description of what it actually does). Essentially doing BigInteger.toString(32) just means you only get characters 0-9 + a-v rather than 0-9 + a-z.

    – Thor84no

    Aug 29, 2012 at 15:51


876

Java supplies a way of doing this directly with java.util.UUID. If you don’t want the dashes, they are easy to strip out: just use uuid.replace("-", "").

import java.util.UUID;

public class RandomStringGenerator {
    public static void main(String[] args) {
        System.out.println(generateString());
    }

    public static String generateString() {
        String uuid = UUID.randomUUID().toString();
        return "uuid = " + uuid;
    }
}

Output

uuid = 2d7428a6-b58c-4008-8575-f05549f16316
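
The dash-stripping variant mentioned at the top of this answer can be sketched as follows. The class name CompactUuid is hypothetical. Note that the result contains only the 16 hexadecimal symbols, and that the caveats about using UUIDs as security tokens still apply.

```java
import java.util.UUID;

final class CompactUuid {

    /** A random UUID rendered as 32 lowercase hexadecimal characters. */
    static String next() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(next());
    }
}
```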

12

  • 35

    Beware that this solution only generates a random string with hexadecimal characters. Which can be fine in some cases.

    – Dave

    May 5, 2011 at 9:28

  • 6

    The UUID class is useful. However, they aren’t as compact as the identifiers produced by my answers. This can be an issue, for example, in URLs. Depends on your needs.

    – erickson

    Aug 24, 2011 at 16:37

  • 6

    @Ruggs – The goal is alpha-numeric strings. How does broadening the output to any possible bytes fit with that?

    – erickson

    Oct 7, 2011 at 16:18

  • 75

    According to RFC4122 using UUID’s as tokens is a bad idea: Do not assume that UUIDs are hard to guess; they should not be used as security capabilities (identifiers whose mere possession grants access), for example. A predictable random number source will exacerbate the situation. ietf.org/rfc/rfc4122.txt

    – Somatik

    Dec 31, 2012 at 11:31

  • 37

    UUID.randomUUID().toString().replaceAll("-", ""); makes the string alpha-numeric, as requested.

    – Numid

    Jan 22, 2014 at 9:58

624

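import java.security.SecureRandom;

// The wrapper class name is illustrative; only the members come from the answer.
public class RandomStrings {
    static final String AB = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    // One shared generator; there is no need to create one per call.
    static final SecureRandom rnd = new SecureRandom();

    // Returns len characters, each drawn uniformly from AB.
    static String randomString(int len) {
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++)
            sb.append(AB.charAt(rnd.nextInt(AB.length())));
        return sb.toString();
    }
}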

7

  • 70

    +1, the simplest solution here for generating a random string of specified length (apart from using RandomStringUtils from Commons Lang).

    – Jonik

    Apr 20, 2012 at 15:49

  • 15

    Consider using SecureRandom instead of the Random class. If passwords are generated on a server, it might be vulnerable to timing attacks.

    – foens

    Jun 25, 2014 at 13:34

  • 10

    I would add lowercase also: AB = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; and some other allowed chars.

    – ACV

    Sep 7, 2015 at 20:56


  • 1

    Why not put static Random rnd = new Random(); inside the method?

    – Micro

    Feb 8, 2016 at 1:25

  • 6

    @MicroR Is there a good reason to create the Random object in each method invocation? I don’t think so.

    Feb 15, 2016 at 10:49