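There are also vim.str_utf_start() and vim.str_utf_end().
You can use vim.str_utfindex() and vim.str_byteindex() like this, for example to take a substring by character index instead of byte index: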
---@param str string
---@param i integer
---@param j? integer
---@return string
local function str_sub(str, i, j)
  -- Number of characters (UTF-32 code units) in the string
  local length = vim.str_utfindex(str)
  -- Negative indices count from the end, as in string.sub
  if i < 0 then i = i + length + 1 end
  if (j and j < 0) then j = j + length + 1 end
  -- Clamp the range to the string
  local u = (i > 0) and i or 1
  local v = (j and j <= length) and j or length
  if (u > v) then return "" end
  -- Convert the character range to a byte range
  local s = vim.str_byteindex(str, u - 1)
  local e = vim.str_byteindex(str, v)
  return str:sub(s + 1, e)
end
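As for vim.str_utf_start() and vim.str_utf_end(): both take a string and a 1-based byte index and return the byte offset to the first (zero or negative) or last (zero or positive) byte of the character that index falls in. A minimal sketch of how they can be combined, assuming the index is within the string; char_at is a made-up name for illustration, not part of Neovim:

```lua
-- Hypothetical helper (not a Neovim API): return the whole UTF-8 character
-- that contains the byte at 1-based position `byte_index`.
-- Assumes `byte_index` is within the string.
local function char_at(str, byte_index)
  -- Offset back to the first byte of the character (0 or negative)
  local first = byte_index + vim.str_utf_start(str, byte_index)
  -- Offset forward to the last byte of the character (0 or positive)
  local last = byte_index + vim.str_utf_end(str, byte_index)
  return str:sub(first, last)
end
```

For example, in "aæb" the character 'æ' occupies bytes 2 and 3, so char_at("aæb", 2) and char_at("aæb", 3) should both return "æ".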
The following function gives you the byte size of the character that starts at a given byte index. Start at index 1 and keep adding the returned value until you have consumed all bytes; for example, the second character starts at 1 + the value returned for index 1.
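local function char_byte_count(s, i)
  local c = string.byte(s, i or 1)
  -- Index past the end of the string: there is no character here
  if not c then return nil end
  -- Get byte count of a UTF-8 encoded character from its lead byte (RFC 3629)
  if c <= 127 then
    return 1
  elseif c >= 194 and c <= 223 then
    return 2
  elseif c >= 224 and c <= 239 then
    return 3
  elseif c >= 240 and c <= 244 then
    return 4
  end
  -- Continuation byte (128-191) or invalid lead byte
  return nil
end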
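The iteration described above could look roughly like this. It is only a sketch: it carries its own copy of the lead-byte check so it runs in plain Lua, and utf8_chars is a made-up name. It also does not validate continuation bytes, so malformed input is split on a best-effort basis.

```lua
-- Local copy of the lead-byte check so this sketch runs on its own.
local function char_byte_count(s, i)
  local c = string.byte(s, i or 1)
  if not c then return nil end
  if c <= 127 then return 1
  elseif c >= 194 and c <= 223 then return 2
  elseif c >= 224 and c <= 239 then return 3
  elseif c >= 240 and c <= 244 then return 4
  end
  return nil
end

-- Hypothetical helper: split a UTF-8 string into a list of characters.
local function utf8_chars(s)
  local chars = {}
  local i = 1
  while i <= #s do
    -- Fall back to 1 byte on an invalid lead byte so the loop always advances
    local n = char_byte_count(s, i) or 1
    chars[#chars + 1] = s:sub(i, i + n - 1)
    i = i + n
  end
  return chars
end

-- "aé€" written with byte escapes: 1-byte, 2-byte and 3-byte characters
local chars = utf8_chars("a\195\169\226\130\172")
print(#chars) -- 3
```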
If you need to know how many display cells a character will take up, there is the function vim.api.nvim_strwidth().
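One place this comes in handy is padding text to a fixed column width, where the byte length and even the character count give the wrong answer for wide characters. A small sketch, with pad_right being a made-up helper name:

```lua
-- Hypothetical helper (not a Neovim API): pad `text` with spaces on the
-- right until it occupies at least `cells` display cells.
local function pad_right(text, cells)
  local width = vim.api.nvim_strwidth(text)
  return text .. string.rep(" ", math.max(0, cells - width))
end
```

For example, "日本" is 6 bytes and 2 characters but takes 4 cells, so pad_right("日本", 6) should append two spaces.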