[ruby] How to capitalize the first letter in a String in Ruby

The upcase method capitalizes the entire string, but I need to capitalize only the first letter.

Also, I need to support several popular languages, like German and Russian.

How do I do it?

This question is related to ruby

The answer is


You can use mb_chars. This respects umlaute:

class String

  # Only capitalize first letter of a string
  def capitalize_first
    self[0] = self[0].mb_chars.upcase
    self
  end

end

Example:

"ümlaute".capitalize_first
#=> "Ümlaute"

Below is another way to capitalize each word in a string. \w doesn't match Cyrillic characters or Latin characters with diacritics but [[:word:]] does. upcase, downcase, capitalize, and swapcase didn't apply to non-ASCII characters until Ruby 2.4.0 which was released in 2016.

"aAa-BBB ä ????? _a a_a".gsub(/\w+/,&:capitalize)
=> "Aaa-Bbb ä ????? _a A_a"
"aAa-BBB ä ????? _a a_a".gsub(/[[:word:]]+/,&:capitalize)
=> "Aaa-Bbb Ä ????? _a A_a"

[[:word:]] matches characters in these categories:

Ll (Letter, Lowercase)
Lu (Letter, Uppercase)
Lt (Letter, Titlecase)
Lo (Letter, Other)
Lm (Letter, Modifier)
Nd (Number, Decimal Digit)
Pc (Punctuation, Connector)

[[:word:]] matches all 10 of the characters in the "Punctuation, Connector" (Pc) category:

005F _ LOW LINE
203F ? UNDERTIE
2040 ? CHARACTER TIE
2054 ? INVERTED UNDERTIE
FE33 ? PRESENTATION FORM FOR VERTICAL LOW LINE
FE34 ? PRESENTATION FORM FOR VERTICAL WAVY LOW LINE
FE4D ? DASHED LOW LINE
FE4E ? CENTRELINE LOW LINE
FE4F ? WAVY LOW LINE
FF3F _ FULLWIDTH LOW LINE

This is another way to only convert the first character of a string to uppercase:

"striNG".sub(/./,&:upcase)
=> "StriNG"

Use capitalize. From the String documentation:

Returns a copy of str with the first character converted to uppercase and the remainder to lowercase.

"hello".capitalize    #=> "Hello"
"HELLO".capitalize    #=> "Hello"
"123ABC".capitalize   #=> "123abc"

My version:

class String
    def upcase_first
        return self if empty?
        dup.tap {|s| s[0] = s[0].upcase }
    end
    def upcase_first!
        replace upcase_first
    end
end

['NASA title', 'MHz', 'sputnik'].map &:upcase_first  #=> ["NASA title", "MHz", "Sputnik"]

Check also:
https://www.rubydoc.info/gems/activesupport/5.0.0.1/String%3Aupcase_first
https://www.rubydoc.info/gems/activesupport/5.0.0.1/ActiveSupport/Inflector#upcase_first-instance_method


Unfortunately, it is impossible for a machine to upcase/downcase/capitalize properly. It needs way too much contextual information for a computer to understand.

That's why Ruby's String class only supports capitalization for ASCII characters, because there it's at least somewhat well-defined.

What do I mean by "contextual information"?

For example, to capitalize i properly, you need to know which language the text is in. English, for example, has only two is: capital I without a dot and small i with a dot. But Turkish has four is: capital I without a dot, capital I with a dot, small i without a dot, small i with a dot. So, in English 'i'.upcase # => 'I' and in Turkish 'i'.upcase # => 'I'. In other words: since 'i'.upcase can return two different results, depending on the language, it is obviously impossible to correctly capitalize a word without knowing its language.

But Ruby doesn't know the language, it only knows the encoding. Therefore it is impossible to properly capitalize a string with Ruby's built-in functionality.

It gets worse: even with knowing the language, it is sometimes impossible to do capitalization properly. For example, in German, 'Maße'.upcase # => 'MASSE' (Maße is the plural of Maß meaning measurement). However, 'Masse'.upcase # => 'MASSE' (meaning mass). So, what is 'MASSE'.capitalize? In other words: correctly capitalizing requires a full-blown Artificial Intelligence.

So, instead of sometimes giving the wrong answer, Ruby chooses to sometimes give no answer at all, which is why non-ASCII characters simply get ignored in downcase/upcase/capitalize operations. (Which of course also reads to wrong results, but at least it's easy to check.)


Well, just so we know how to capitalize only the first letter and leave the rest of them alone, because sometimes that is what is desired:

['NASA', 'MHz', 'sputnik'].collect do |word|
  letters = word.split('')
  letters.first.upcase!
  letters.join
end

 => ["NASA", "MHz", "Sputnik"]

Calling capitalize would result in ["Nasa", "Mhz", "Sputnik"].


capitalize first letter of first word of string

"kirk douglas".capitalize
#=> "Kirk douglas"

capitalize first letter of each word

In rails:

"kirk douglas".titleize
=> "Kirk Douglas"

OR

"kirk_douglas".titleize
=> "Kirk Douglas"    

In ruby:

"kirk douglas".split(/ |\_|\-/).map(&:capitalize).join(" ") 
#=> "Kirk Douglas"

OR

require 'active_support/core_ext'
"kirk douglas".titleize

Rails 5+

As of Active Support and Rails 5.0.0.beta4 you can use one of both methods: String#upcase_first or ActiveSupport::Inflector#upcase_first.

"my API is great".upcase_first #=> "My API is great"
"?????".upcase_first           #=> "?????"
"?????".upcase_first           #=> "?????"
"NASA".upcase_first            #=> "NASA"
"MHz".upcase_first             #=> "MHz"
"sputnik".upcase_first         #=> "Sputnik"

Check "Rails 5: New upcase_first Method" for more info.