Friday, June 12, 2009

Handling multiple encodings in Vim

Most people edit, load and save files in a single character encoding (e.g. en_US.UTF-8), but for many this is not the case. I need to write documents and emails in Japanese UTF-8 (ja_JP.UTF-8), Latex in EUC-JP (ja_JP.EUC-JP) and source code comments in Shift-JIS (ja_JP.SJIS).

To handle a certain character encoding there are three things you must consider:

  • Your editor character encoding

  • Your terminal character encoding

  • Your font character encoding support



In the prehistoric era, editors and terminals could each support only a single character encoding, so you needed a different editor/terminal pair for every encoding you worked with. Believe me when I tell you this was not fun at all.

These days most editors and terminals support a large array of character encodings, and a lot of free fonts are available that support all the character sets I need. Still, it is usually necessary to reconfigure each part (editor and terminal) or to create separate profiles, for both the editor and the console, for each character encoding you need to edit.

Today I took the time to understand how vim's character encoding support works, and I found that it has everything I need and a lot more. By carefully setting the fenc, fencs, enc and tenc options I can edit any file in any character encoding with little effort. Here are my .vimrc settings:

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
"" Character encoding settings
"" By manipulating these options it is possible to edit all files in one
"" encoding while using the terminal in a different encoding and writing/reading
"" the file in another encoding. Here we set all of them to UTF-8.
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

" Default file encoding for new files
setglobal fenc=utf-8

" Auto detect file encoding when opening a file. To check what file encoding was
" selected run ":set fenc" and if you know the auto detection failed and want to
" force another one run ":edit ++enc=<your_enc>".
set fencs=utf-8,euc-jp,sjis

" Internal encoding used by vim buffers, help and commands
set enc=utf-8

" Terminal encoding used for input and terminal display
" Make sure your terminal is configured with the same encoding.
set tenc=utf-8





  • tenc:

    This is the character encoding used to display text on, and read input from, the terminal. I always keep my terminal (Konsole) in UTF-8 and, as far as I know, my Japanese input method (scim/anthy) also uses UTF-8, so to avoid display and input problems I leave this set to UTF-8.

  • enc:

    The encoding used internally by vim buffers, help and commands. This does not need to be the same as tenc as vim will convert from one encoding to the other if they differ. This way you can use your native language encoding in your terminal and input method and let vim handle everything internally using UTF-8.


  • fenc:

    This is the character encoding used for reading and writing files. Again, it can differ from enc and tenc because vim converts between them. This way you can have your terminal configured for your native language (e.g. Japanese, Russian, Chinese...), let vim work internally in UTF-8 and finally save your files in any encoding you want by setting fenc.


  • fencs:

    This is the list vim uses to auto detect the character encoding when opening an existing file. Vim tries each entry in order and uses the first one that converts the file without errors, so the order matters; read the help (":h fencs") to learn how to set it correctly. For example, if I put euc-jp first in the list, all my English documents will be detected as euc-jp instead of utf-8, because plain ASCII text is also valid euc-jp. The same goes for 8-bit encodings such as latin1 and cp1250, so make sure to put these at the end of the list. If the auto detection fails and your document is not displayed correctly, you can always reload it with a forced encoding using ":edit ++enc=euc-jp", replacing euc-jp with your desired encoding.
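
As a rough sketch of that ordering rule (a variation for illustration, not the setting from my .vimrc; the ucs-bom and latin1 entries are assumptions added here), the strict encodings go first and the 8-bit catch-all goes last:

```vim
" Sketch only: ucs-bom and latin1 are illustrative additions.
" ucs-bom      : checked first, matches only files starting with a byte order mark
" utf-8        : strict, fails on bytes that are not valid UTF-8
" euc-jp, sjis : multi-byte encodings, tried next in this order
" latin1       : 8-bit, accepts any file, so it must come last
set fileencodings=ucs-bom,utf-8,euc-jp,sjis,latin1
```

Anything listed after latin1 would never be tried, because latin1 never reports a conversion error.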


In my .vimrc above I set everything to UTF-8, which is recommended because converting between encodings may cause loss of information. The only option I change is fenc, when I need to edit or save a file in a different encoding.

For example if I want to create a new file in euc-jp encoding:


- Open the new file as normal using vim <filename>
- Set file encoding using :set fenc=euc-jp
- Edit/Save as much as you like and rest assured that your file is euc-jp.
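
If most new files of one type need the same encoding, the steps above could probably be automated with an autocommand. This is only a sketch, assuming every new *.tex file should be EUC-JP; the group name tex_encoding and the *.tex pattern are illustrative choices, not something vim requires:

```vim
" Sketch: make new (not yet existing) *.tex files default to EUC-JP.
" The group name 'tex_encoding' and the *.tex pattern are illustrative.
augroup tex_encoding
  " Clear the group first so re-sourcing .vimrc does not duplicate the autocmd
  autocmd!
  " BufNewFile fires only when starting to edit a file that does not exist yet
  autocmd BufNewFile *.tex setlocal fileencoding=euc-jp
augroup END
```

Existing files are not affected; for those the encoding still comes from the detection in fencs.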



To edit an existing file, simply open it and let vim auto detect the encoding from the candidates listed in fencs. To check which encoding vim chose, use the command ":set fenc" and it will display the detected encoding. If it is not the correct one, you can force the encoding by reloading the file with ":edit ++enc=euc-jp", replacing "euc-jp" with the encoding you want.
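
Put together, a typical session for checking, fixing and converting an existing file might look like the following. The encodings euc-jp and sjis are just examples; the last two commands are an extra step, assuming you also want to convert the file rather than only view it:

```vim
" Show the encoding vim detected for the current buffer
:set fileencoding?

" Detection was wrong: reload the file, forcing EUC-JP
:edit ++enc=euc-jp

" Optional conversion: once the text displays correctly,
" change the file encoding and write the file back as Shift-JIS
:set fileencoding=sjis
:write
```

Run the reload right after opening the file, since :edit refuses to discard unsaved changes unless you add a "!".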

Vim tips for Latex editing

Simple Latex Makefile



With this simple Makefile you can compile your latex projects from within vim and use the quickfix window in case of errors to traverse and fix them one by one.



#
# This makefile generates a PDF of the MAINTEX file.
# The output can be found inside the build subdirectory.
#
# Makefile based on  http://www.wlug.org.nz/LatexMakefiles
#
# This makefile has been tested in Kubuntu 9.04
#
# Prerequisites:
#
# sudo aptitude install ptex-bin ptex-base okumura-clsfiles ptex-jisfonts \
#     xdvik-ja dvipsk-ja dvi2ps gv jbibtex-bin jmpost mendexk okumura-clsfiles \
#     vfdata-morisawa5 dvi2ps-fontdesc-morisawa5 texlive-latex-extra latexmk \
#     dvipng xpdf gs-cjk-resource vfdata-morisawa5 dvi2ps-fontdesc-morisawa5 \
#     cmap-adobe-japan1 cmap-adobe-japan2 cmap-adobe-cns1 cmap-adobe-gb1
# sudo jisftconfig add
#
# When writing Japanese make sure your editor is saving the tex files
# in euc-jp encoding. In VIM this can be accomplished by setting the fenc
# option:
#
#  - Open file as normal using vim <filename>
#  - Type  :edit ++enc=euc-jp
#  - Type  :set fenc=euc-jp
#  - Type  :set enc=utf-8
#
# Note: recipe lines (the indented commands) must start with a tab character.

## TODO Add rule to convert images
## TODO Add makeindex to create .toc files

###########################################################################
## Put here the file name of the main tex file.
MAINTEX      = main

############################################################################
## Change the following only if you know what you are doing
LATEXCMD     = platex     # [latex | platex]
BIBCMD       = jbibtex    # [bibtex | jbibtex]
DVIPDFCMD    = dvipdf     # [dvipdf | dvipdfm | dvipdfmx]
PDFVIEWER    = xpdf       # [okular | xpdf | evince ]
DVIVIEWER    = xdvi-ja    # [xdvi | xdvi-ja | kdvi ]

## No need to change anything below this line
TEXFILES     = $(wildcard *.tex)
# The path variables are commented on separate lines: make would otherwise
# keep the spaces before a trailing comment as part of the value.
# Path to search for .tex, .cls and .sty files
TEXINPUTS=:$(PWD)//:
# Path to search for .bst files
BSTINPUTS=:$(PWD)//:
# Path to search for .bib files
BIBINPUTS=:$(PWD)//:
# Output dir for bibtex and jbibtex
TEXMFOUTPUT=$(PWD)/build
LATEXOPTS= -output-directory=build -file-line-error -interaction=nonstopmode

export TEXFILES TEXINPUTS BSTINPUTS BIBINPUTS TEXMFOUTPUT

.PHONY: all wordcount charcount pdf dvi

all: build/$(MAINTEX).pdf

build :
	@echo "Creating build directory"
	@mkdir -p build

build/$(MAINTEX).aux: $(MAINTEX).tex $(TEXFILES) build
	$(LATEXCMD) $(LATEXOPTS) $(MAINTEX)

build/$(MAINTEX).bbl: build/$(MAINTEX).aux
	$(BIBCMD) build/$(MAINTEX)

build/$(MAINTEX).dvi: build/$(MAINTEX).bbl
	$(LATEXCMD) $(LATEXOPTS) $(MAINTEX)
	$(LATEXCMD) $(LATEXOPTS) $(MAINTEX)

build/$(MAINTEX).pdf: build/$(MAINTEX).dvi
	$(DVIPDFCMD) build/$(MAINTEX).dvi build/$(MAINTEX).pdf

dvi: build/$(MAINTEX).dvi
	$(DVIVIEWER) build/$(MAINTEX).dvi

pdf: build/$(MAINTEX).pdf
	$(PDFVIEWER) build/$(MAINTEX).pdf

# Word counting can be done in VIM using g Ctrl-g but that count also includes
# latex commands. These rules drop the lines that start with a latex command
# or a comment and count what is left. The -h flag keeps grep from prefixing
# each line with its file name when there are several tex files.
wordcount:
	@echo Approximate word count: `grep -hv '^\\\\' $(TEXFILES)|grep -v '^%'|wc -w`

charcount:
	@echo Approximate char count: `grep -hv '^\\\\' $(TEXFILES)|grep -v '^%'|wc -c`






Put all your tex files, images, sty, cls, bib, bst files inside a directory with any subfolder tree structure you like. This Makefile sets some environment variables that allow it to find all these files as long as they are below the current directory tree.

Make sure your main tex file is called main.tex or, if you prefer another name, change the MAINTEX variable in the Makefile. Note that its value has no extension.

Also note that this Makefile is set up to compile Japanese EUC-JP tex files (IEICE Trans). If you do not need Japanese, replace platex with plain latex and jbibtex with bibtex.

All output files generated by latex/platex, bibtex/jbibtex are stored inside a build subdirectory.

Now in your .vimrc you can add the following mappings to compile your latex project:

""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
" Working with Makefiles
"
" Press F5 to compile and open the error window if there
" are errors. If there are errors you can use :cn and :cN to
" jump forward and backward through the error list. Type :ccl to
" close the error list or press F5 again to recompile.
"
" The comments sit on their own lines because vim would treat a comment
" placed after a map command as part of the mapping.

" Compile and open quick fix list
map <F5> <ESC>:make<CR><ESC>:botright cwindow<CR>
" Jump to prev error/warn
map <F6> <ESC>:cN<CR>
" Jump to next error/warn
map <F7> <ESC>:cn<CR>






Now when editing your latex files press <F5> to compile them and, if errors occur, you will be presented with a quickfix window where you can use <F6> and <F7> to quickly jump to the previous and next error message.
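
An alternative to calling :cwindow from the mapping is to let vim open the quickfix window after every :make, however it was started. This is only a sketch under that assumption; the group name quickfix_after_make is arbitrary:

```vim
" Sketch: open the quickfix window automatically after any :make.
" The group name 'quickfix_after_make' is an illustrative choice.
augroup quickfix_after_make
  autocmd!
  " QuickFixCmdPost fires after a quickfix command has run; the pattern
  " 'make' restricts it to :make. cwindow opens the window only when
  " there are recognized errors.
  autocmd QuickFixCmdPost make botright cwindow
augroup END
```

With this in place a plain :make behaves like <F5>, and the <F6>/<F7> mappings keep working unchanged.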

Additionally this Makefile also has a few useful commands you can invoke from within vim:



  • :!make pdf to display the pdf file using xpdf or if you prefer a different viewer replace the PDFVIEWER variable in the Makefile.


  • :!make dvi to display the dvi file using xdvi or if you prefer a different viewer replace the DVIVIEWER variable in the Makefile.


  • :!make wordcount to display the number of words in all the tex documents. In vim you can also use g<ctrl-g> in normal mode but this also counts latex commands.


  • :!make charcount same as wordcount but for characters instead of words.



Latex with Vim TagList plugin





We can enable TagList to display important keywords like sections, labels and references for easy navigation.

First create a ~/.ctags file that contains the following:


--langdef=tex
--langmap=tex:.tex
--regex-tex=/\\subsubsection[ \t]*\*?\{[ \t]*([^}]*)\}/- \1/s,subsubsection/
--regex-tex=/\\subsection[ \t]*\*?\{[ \t]*([^}]*)\}/+\1/s,subsection/
--regex-tex=/\\section[ \t]*\*?\{[ \t]*([^}]*)\}/\1/s,section/
--regex-tex=/\\chapter[ \t]*\*?\{[ \t]*([^}]*)\}/\1/c,chapter/
--regex-tex=/\\label[ \t]*\*?\{[ \t]*([^}]*)\}/\1/l,label/
--regex-tex=/\\ref[ \t]*\*?\{[ \t]*([^}]*)\}/\1/r,ref/


Then modify the taglist.vim file and add a new language. To do this, search for "yacc language" and add these lines before it:


" tex language
let s:tlist_def_tex_settings = 'tex;s:section;c:chapter;l:label;r:ref'

" yacc language


Now when you open the taglist window you will get a nice list of sections, subsections, chapters, references, labels, etc. that you can navigate and use to jump quickly to each of them.
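
If you would rather not edit the plugin file, the taglist documentation also describes per-language settings variables that can be set from ~/.vimrc. The following is a sketch of that approach, assuming your taglist version supports it and that vim detects your files as filetype "tex":

```vim
" Sketch: same language definition as above, but set from ~/.vimrc.
" The variable name ends in the vim filetype (tex); the first field of the
" value is the ctags language name defined with --langdef=tex in ~/.ctags.
let tlist_tex_settings = 'tex;s:section;c:chapter;l:label;r:ref'
```

This keeps the setting in place when the plugin is upgraded, since taglist.vim itself stays untouched.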

If you do not have the taglist plugin installed you can follow my instructions here.

Reference autocompletion





If you write your labels as \label{fig:something}, then put this in your ~/.vimrc file:



set iskeyword+=:           " type \ref{fig: and press <C-n> to autocomplete references



With this you can now type \ref{fig: and press <C-n> to get a list of labels you can cycle through.


Spell checking



This is not only for latex files. Simply put the following in your ~/.vimrc file to enable automatic spell checking. Learn how to use the spell checker to correct, add and/or ignore words.

""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
" Enable spell checking in latex, bib and txt files
" Commands:
"            ]s      ->  jump to next bad word
"            [s      ->  jump to prev bad word
"            z=      ->  suggest word
"            zg      ->  mark word as good (add to dictionary)
"            zw      ->  mark word as bad  (add as a wrong word)
"            :h spell -> get more details about spelling
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
au BufRead,BufNewFile *.txt,*.tex,*.bib  setlocal spell spelllang=en_us