I'm trying to search for the word Gadaffi. What's the best regular expression to search for this?
My best attempt so far is:
\b[KG]h?add?af?fi$\b
But I still seem to be missing some journals. Any suggestions?
Update: I found a pretty extensive list here: http://blogs.abcnews.com/theworldnewser/2009/09/how-many-different-ways-can-you-spell-gaddafi.html
The answer below matches all 30 variants:
Gadaffi
Gadafi
Gadafy
Gaddafi
Gaddafy
Gaddhafi
Gadhafi
Gathafi
Ghadaffi
Ghadafi
Ghaddafi
Ghaddafy
Gheddafi
Kadaffi
Kadafi
Kaddafi
Kadhafi
Kazzafi
Khadaffy
Khadafy
Khaddafi
Qadafi
Qaddafi
Qadhafi
Qadhdhafi
Qadthafi
Qathafi
Quathafi
Qudhafi
Kad'afi
Answer
\b[KGQ]h?add?h?af?fi\b
The Arabic transcription is (Wiki says) "Qaḏḏāfī", so maybe add a Q. And an optional H ("Gadhafi", as the article below mentions).
Btw, why is there a $ at the end of the regex?
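On the `$` question, a quick sketch of what it does to the original pattern, assuming Python's `re` module (the thread doesn't say which regex engine is in use) and a made-up sample sentence. Without a multiline flag, `$` only matches at the end of the input, so the name is found only when it is the last word, which may be why some journals were missed.

```python
import re

# The asker's original pattern, including the trailing `$`.
with_dollar = re.compile(r"\b[KG]h?add?af?fi$\b")
# The same pattern with the `$` removed.
without_dollar = re.compile(r"\b[KG]h?add?af?fi\b")

# Hypothetical sample texts, not taken from the thread.
mid_sentence = "Gaddafi spoke today."
end_of_text = "a speech by Gaddafi"

dollar_hit = with_dollar.search(mid_sentence)    # name is not at the end of the input
plain_hit = without_dollar.search(mid_sentence)  # name is found anywhere in the text
end_hit = with_dollar.search(end_of_text)        # name is the last thing in the input

print(dollar_hit, plain_hit, end_hit)
```

With the `$` in place, only the last search and the `$`-free pattern find the name.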
Btw, nice article on the topic:
Gaddafi, Kadafi, or Qaddafi? Why is the Libyan leader’s name spelled so many different ways?
EDIT
To match all the names in the article you mentioned later, this should work. Let's just hope it won't match a lot of other stuff :D
\b(Kh?|Gh?|Qu?)[aeu](d['dt]?|t|zz|dhd)h?aff?[iy]\b
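A small way to check that claim, again assuming Python's `re` module. The `variants` list is the 30 spellings from the question; the names `pattern`, `variants` and `missed` are just for illustration.

```python
import re

# The extended pattern from the EDIT above.
pattern = re.compile(r"\b(Kh?|Gh?|Qu?)[aeu](d['dt]?|t|zz|dhd)h?aff?[iy]\b")

# The 30 spellings listed in the question.
variants = """
Gadaffi Gadafi Gadafy Gaddafi Gaddafy Gaddhafi Gadhafi Gathafi
Ghadaffi Ghadafi Ghaddafi Ghaddafy Gheddafi
Kadaffi Kadafi Kaddafi Kadhafi Kazzafi Khadaffy Khadafy Khaddafi
Qadafi Qaddafi Qadhafi Qadhdhafi Qadthafi Qathafi Quathafi Qudhafi
Kad'afi
""".split()

# Collect every spelling the pattern fails to match in full.
missed = [name for name in variants if not pattern.fullmatch(name)]

print(len(variants), "variants checked; missed:", missed)
```

Running the same list against a large sample of unrelated text would be the way to see how much "other stuff" the pattern picks up.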