I use SheetJS (xlsx) to load Excel files in JavaScript (TypeScript).
The usage is a bit peculiar and I forget how to use it easily, so this is a memo for my own use.
manual
install
npm install xlsx
File read
import * as xlsx from 'xlsx'; const book = xlsx.readFile('test.xlsx');
sheet
Sheets are retrieved by specifying the sheet name.
const sheet = book.Sheets['<sheet_name>'];
The sheet list is book.SheetNames
, so to specify a sheet by its index, do the following
const sheet = book.Sheets[book.SheetNames[0]];
The check to show/hide sheets is complicated.
Sheet objects do not have show/hide information.
There is an array called "book.Workbook.Sheets
" that contains the names and display states of the sheets that the workbook has, and we need to look at that array to find out.
Objects in the array have a "name" property for the sheet name and a "Hidden" property for the display status, with a "Hidden" value of "0" indicating display and "Hidden" indicating non-display.
For example, to check whether to show or hide by sheet name, the following is used.
const _sheets = book.Workbook.Sheets; const idx = _sheets.map((v:any)=>v.name).indexOf(<sheet_name>); console.log(_sheets[idx].Hidden); // 0 for display 1 or 2 for non-display
cell
Cells are addressed and retrieved using the A1
notation.
const cel = sheet['B2'];
Before describing the acquired cells, first an explanation on how to address the cells.
Address designation A1
notation" and "index notation
Use the "A1
notation" to specify to SheetJS (xlsx).
However, from a programmatic standpoint, it is easier to handle rows and columns by index (numerical value), so utility functions are provided to convert between A1
notation and "index notation.
In a program, an address is indicated in "index notation," converted to A1
notation, and then passed as a SheetJS argument.
index notation
cell
Rows are represented by r
and columns by c
.
The index starts with "0".
// 'A1' { c:0, r:0 }
area
The top left cell (start) is represented by s
and the bottom right cell (end) by e
.
// 'A1:B2' { s:{c:0, r:0}, e:{c:1, r:1} }
A1
and index
conversion
cell
const id = xlsx.util.decode_cell('A1'); // {c:0, r:0} const a1 = xlsx.util.encode_cell({c:0, r:0}); // 'A1'
area
const id = xlsx.util.decode_range('A1:B2'); // { s:{c:0, r:0}, e:{c:1, r:1} } const a1 = xlsx.util.encode_range({ s:{c:0 ,r:0}, e:{c:1, r:1} }); // 'A1:B2'
Data range of the sheet
The data area of the sheet can be obtained with sheet['!ref']
.
The area is expressed in A1
notation, so it is converted to "index notation" and used as the range when the for
loop is turned.
const range_a1 = sheet['!ref']; const range = xlsx.util.decode_range(range_a1);
value acquisition
Values are obtained in units of cells, addressed in A1
notation.
Values are returned in JSON.
const cel = sheet['A1']; console.log(cel); // {v:123, w:123, t:'n'}
JSON Format
key name | Content |
---|---|
v | raw data |
w | Display Data |
t | data type |
How to get the value
- The
w
is the value that will be displayed when opened in Excel and is returned as a String. v
is the actual data in the cell. For example, ifw
is "1900/1/1",v
is "1".- In general,
w
is the most accurate way to retrieve data. - However, there may be cases where
w
is not available, in which case the valuev
is used. - Since
w
is a String, if the type of the value to be obtained is clearly known, it can be converted and used. t
(data type) should not be trusted and should be ignored.
blank cell
Blank cells will result in the retrieved cell being "undefined".
Get Date Value
- The basic
w
is acquired and used. - In most cases,
t
does not becomed
(date type) even if it is a date display. - However, in very rare cases,
t
becomesd
(date type). - In this case, create a Date from the value of
v
as shown below.
import * as moment from 'moment'; const cell = sheet['A1']; let dt = null; if( cell.t==='d'){ dt = moment(cell.v); } else { dt = moment(cell.w); } console.log(dt.format('YYYY-MM-DD'));
JSON and CSV acquisition
There is a utility function to retrieve data from a sheet in JSON and CSV.
If the data on the sheet is tabular data with headers, it is easiest to use this to get the data in JSON.
CSV can be saved as a CSV file by outputting the file as is.
const json = xlsx.util.sheet_to_json(sheet); const csv = xlsx.util.sheet_to_json(sheet); fs.writeFileSync('test.csv', csv);
Sample (output all data in a sheet)
import * as xlsx from 'xlsx'; const book = xlsx.readFile('test.xlsx'); const sheet = book.Sheets[book.SheetNames[0]]; const range = xlsx.utils.decode_range(sheet['!ref']); for(let row=range.s.r; row<=range.e.r; row++){ for(let col=range.s.c; col<=range.e.c; col++){ const address = xlsx.utils.encode_cell({r:row, c:col}); const cell = sheet[address]; if(cell){ console.log( cell.w ? cell.w : cell.v ); } else { console.log( '' ); } } }
Impressions, etc.
I thought about putting together an Excel output as well, but decided not to because since I started using BI tools, I rarely output to Excel anymore, although I do read Excel.