CamelCase
The problem
Detect and parse CamelCase in PHP. As Wikipedia describes it, CamelCase is:
The practice of writing compound words or phrases where the words are joined without spaces, and each word is capitalized within the compound. The name comes from the uppercase bumps in the middle of the compound word, suggesting the humps of a camel.
This technique is often used to specify internal links to resources in a Wiki.
The solution
Using regular expressions, we can split a CamelCase string into individual words along the aA boundary, that is, wherever a lowercase letter is immediately followed by an uppercase one. This brings to mind the preg_split function, which splits a string into an array, using a regular expression as the delimiter. The obvious solution (well, obvious if you understand regular expressions… but I digress) is to do the following:
preg_split("{
[a-z] #detect a lowercase character
[A-Z] #followed by an uppercase character
}x", $thetext);
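To see concretely what this first attempt returns, here is a minimal sketch. The sample value assigned to $thetext is an assumption for illustration, since the snippet above leaves it undefined, and the pattern is the same one written on a single line.

```php
<?php
// Assumed sample input; the snippet above does not define $thetext.
$thetext = "ExampleText";

// Same pattern as above: a lowercase letter followed by an uppercase one.
// Whatever the pattern matches is treated as the delimiter and discarded.
$naive = preg_split('/[a-z][A-Z]/', $thetext);

print_r($naive); // two entries: "Exampl" and "ext" (the "eT" pair is gone)
```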
This call returns an array split along every lowercaseUPPERCASE boundary, but unfortunately the two characters that form the boundary are consumed as the delimiter and dropped from the result. So, if our example text was ExampleText, the resulting array would contain the entries "Exampl" and "ext".
The fix is to use assertions, which let us require that “something” appears before and/or after the current position without including it in the match. So the regex (foo)(?=bar) will match a foo only if it is immediately followed by a bar, but won’t include bar in the match. For this problem, we want to match a “something” preceded by a lowercase letter and followed by an uppercase one. In our case, the “something” is actually nothing: we want to split along the invisible, zero-width position between a and A. The solution:
$split_up = preg_split("{
(?<=[a-z]) # A look-behind assertion
# for a lowercase letter
(?=[A-Z]) # A look-ahead assertion
# for an uppercase letter
}x", "TheExampleTextOfCamelCase");
print_r($split_up);
Which outputs:
Array
(
    [0] => The
    [1] => Example
    [2] => Text
    [3] => Of
    [4] => Camel
    [5] => Case
)
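Since the problem statement asks for detection as well as parsing, the two steps could be wrapped up roughly as follows. This is only a sketch: the function names splitCamelCase and isCamelCase are hypothetical (they are not part of PHP), the scalar type declarations assume PHP 7 or later, and "is CamelCase" is taken here to mean "contains at least one aA boundary".

```php
<?php
// Hypothetical helper names for illustration; neither is built into PHP.

/**
 * Split a string at every zero-width position that is preceded by
 * a lowercase ASCII letter and followed by an uppercase ASCII letter.
 */
function splitCamelCase(string $text): array
{
    return preg_split('/(?<=[a-z])(?=[A-Z])/', $text);
}

/**
 * Assumed definition of "CamelCase": the string contains at least one
 * lowercase letter immediately followed by an uppercase letter.
 */
function isCamelCase(string $text): bool
{
    return preg_match('/[a-z][A-Z]/', $text) === 1;
}

var_dump(isCamelCase("CamelCase"));      // bool(true)
var_dump(isCamelCase("camel"));          // bool(false)
print_r(splitCamelCase("HTTPRequest"));  // one entry: there is no aA boundary
```

Because the pattern only looks at ASCII a-z and A-Z, a run of capitals such as HTTPRequest stays in one piece, and accented letters are not treated as boundaries. Whether that is acceptable depends on the text being parsed.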
