Tukukkk

Question

I'm using openpyxl library package to read and write some data to an existing excel file test.xlsx.

Before writing some data to it, the content of file look like this:

enter image description here

cell A1 is contain Khmer Unicode character, and English character is in Bold style.

cell A3 used font lemons1 font-face, and English character is in Italic style.

I was using the script below to read and write data "It is me" to cell B2 of this excel file:

from openpyxl import load_workbook

import os

FILENAME1 = os.path.dirname(__file__)+'/test.xlsx'

from flask import make_response

from openpyxl.writer.excel import save_virtual_workbook

from app import app



@app.route('/testexel', methods=['GET'])

def testexel():

    with app.app_context():

        try:

            filename = 'test'

            workbook = load_workbook(FILENAME1, keep_links=False)

            sheet = workbook['Sheet1']

            sheet['B2']='It is me'



            response = make_response(save_virtual_workbook(workbook))

            response.headers['Cache-Control'] = 'no-cache'

            response.headers["Content-Disposition"] = "attachment; filename=%s.xlsx" % filename

            response.headers["Content-type"] = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; charset=utf-8"



            return response



        except Exception as e:



            raise

Then format of resulted excel file was modified as this, which I've never wanted it to be like this :

enter image description here

The formatting style is quite different from the original file before writing data to it:

cell A1 all data is all bold taking style format from English character

cell B3 English character became a normal style, and font was change to font-face limons1 taking from that of khmer character in front of it.

What I am trying to accomplish is to keep existing content of file in the same format (style and font-face) as it was, while writing additional data to it.

Please kindly advise what is wrong with my script and how can I do to keep existing style and font-face unchanged after running above script? Thanks.

Possible duplicate of Editing workbooks with rich text in openpyxl — Oct 18 '18 at 17:42
@CharlieClark why duplicate? if the provide link does seem exactly solve the problem yet. — Oct 19 '18 at 1:22
@CharlieClark the answer seem not address my problem, at least. I’ve searched around some similar questioned, yet none suggested an answer with code to achieve this desire. Therefore I’m looking for something that workaround, at least a reason why it is not possible and which is alternative, avoiding one sentence answer “openpyxls does not support multiple format in one cell” it’s helpless, thanks :) — Oct 19 '18 at 13:58
What you want cannot be done with openpyxl. That is the answer. — Oct 19 '18 at 15:52

Yvonne AburrowYvonne Aburrow 1,678823 · Accepted Answer · 2018-11-20 19:52:36Z

1

+50

According to the answer to this question, you can format cells in Excel using openpyxl.

The answer given there only changes the target cell to bold, but maybe you can change the font face back to lemons1.

from openpyxl.workbook import Workbook

from openpyxl.styles import Font

wb = Workbook()

ws = wb.active

ws['B3'] = "Hello"

ws['B3'].font =  Font(name='lemons1', size=14)

wb.save("FontDemo.xlsx")

However, according to the documentation, you can only apply styles to whole cells, not to part of a cell. So you would need to put the Khmer characters in one cell and the English characters in another cell.

answered Nov 20 '18 at 19:52

Yvonne Aburrow

1,678823

Beside would you like to recommend anything I can try as an alternative solution, something like another package, library that I can achieve above result ? Really appreciated thanks

– Houy Narun
Nov 21 '18 at 0:40

sorry, I had never even heard of this package till I saw your post. You might like to investigate the APIs for interacting with Google Spreadsheets?

– Yvonne Aburrow
Nov 21 '18 at 15:39

add a comment |

andsands 739516 · Accepted Answer · 2018-11-24 00:21:07Z

Excel file (with .xlsx extension) is actually zip archive. (You can actually open excel file with 7-zip or some similar program.) So excel file contains bunch of xml files with data stored in them. What openpyxl does, is reading data from these xml files when opening excel file and creating zip archive with xml files when saving excel file. Simply sad, openpyxl reads some xml files, then it parses that data, than you can use funtions in openpyxl library to change and add data, and finally when you save your workbook, openpyxl will create xml files, write data to them, and save them as zip archive (which is excel file). These xml files contain all data stored in excel file, (one xml file contains formulas from excel file, other will contain styles, in other there will be data about excel theme and so on). We only care about strings in excel file which are stored in two xml file:

sharedStrings.xml

This file contains all strings in excel file and formatting of those strings, here is an example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="2" uniqueCount="2">

    <si>

        <r>

            <rPr>

                <b/>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>Hello</t>

        </r>

        <r>

            <rPr>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t xml:space="preserve"> ត</t>

        </r>

    </si>

    <si>

        <r>

            <t xml:space="preserve">ណ </t>

        </r>

        <r>

            <rPr>

                <i/>

                <sz val="24"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>sike</t>

        </r>

    </si>

</sst>

sheet1.xml

This file contains position of your strings (which cell contains which string). (There will be one file for each sheet in you excel file, but let's say you have only one sheet in your file for purpose of this example.) Here is an example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac xr xr2 xr3" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xr:uid="{00000000-0001-0000-0000-000000000000}">

    <dimension ref="A1:C3"/>

    <sheetViews>

        <sheetView tabSelected="1" zoomScaleNormal="100" workbookViewId="0">

            <selection activeCell="A3" sqref="A3"/>

        </sheetView>

    </sheetViews>

    <sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>

    <cols>

        <col min="1" max="1" width="20.140625" customWidth="1"/>

        <col min="2" max="2" width="10.7109375" customWidth="1"/>

    </cols>

    <sheetData>

        <row r="1" spans="1:3" ht="60.75" customHeight="1" x14ac:dyDescent="0.45">

            <c r="A1" s="4" t="s">

                <v>0</v>

            </c>

        </row>

        <row r="2" spans="1:3" ht="19.5" customHeight="1" x14ac:dyDescent="0.35">

            <c r="A2" s="1"/>

            <c r="B2" s="3"/>

        </row>

        <row r="3" spans="1:3" ht="62.25" customHeight="1" x14ac:dyDescent="0.5">

            <c r="A3" s="5" t="s">

                <v>1</v>

            </c>

            <c r="C3" s="2"/>

        </row>

    </sheetData>

    <pageMargins left="0.75" right="0.75" top="1" bottom="1" header="0.5" footer="0.5"/>

    <pageSetup paperSize="9" orientation="portrait" r:id="rId1"/>

</worksheet>

IF you open this excel with openpyxl and than save it (without changing any data), this is what sharedStrings.xml will look like:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">

    <si>

        <t>Hello ត</t>

    </si>

    <si>

        <t>ណ sike</t>

    </si>

</sst>

As you can see you will lose all cell (strings) original formatting and instead you will get some sort of merged formatting for your cells (So if some characters in cell are bold and some are not, then when you save file, either whole cell will be bold or whole cell will be normal). Now people had asked developers to implement this rich text option (link1, link2) but they sad that it would be complicated to implement something like this. I agree that this wouldn't be easy to do, but we can do something simpler: We can get data from sharedStrings.xml when we are opening excel file, and than use that xml code when we want to save excel file but only for cells that existed when we were opening file. This is probably not easy to understand so lets see following example:

Let's say you have excel file like this:
enter image description here

For this excel file, sharedStrings.xml will be this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1" uniqueCount="1">

    <si>

        <r>

            <rPr>

                <b/>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>Hello</t>

        </r>

        <r>

            <rPr>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t xml:space="preserve"> ត</t>

        </r>

    </si>

</sst>

If you run this python code:

from openpyxl import load_workbook

workbook = load_workbook(FILENAME1, keep_links=False)

sheet = workbook.active

sheet['A2'] = 'It is me'

workbook.save('out.xlsx')

File out.xlsx will look like this:

enter image description here

For out.xlsx file, sharedStrings.xml will be this:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">

    <si>

        <t>Hello ត</t>

    </si>

    <si>

        <t>It is me</t>

    </si>

</sst>

So what we want to do is use this xml code:

<si>

    <r>

        <rPr>

            <b/>

            <sz val="22"/>

            <color theme="1"/>

            <rFont val="Calibri"/>

            <family val="2"/>

            <scheme val="minor"/>

        </rPr>

        <t>Hello</t>

    </r>

    <r>

        <rPr>

            <sz val="22"/>

            <color theme="1"/>

            <rFont val="Calibri"/>

            <family val="2"/>

            <scheme val="minor"/>

        </rPr>

        <t xml:space="preserve"> ត</t>

    </r>

</si>

for old cell A1 which contains Hello ត and this xml code:

<si>

    <t>It is me</t>

</si>

for new cell A2 which contains It is me.

So we can combine this xml parts to get xml file like this:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">

    <si>

        <r>

            <rPr>

                <b/>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>Hello</t>

        </r>

        <r>

            <rPr>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t xml:space="preserve"> ត</t>

        </r>

    </si>

    <si>

        <t>It is me</t>

    </si>

</sst>

I wrote some functions to be able to this. (There is quite a lot of code, but most of it is just copied from openpyxl. If you would change openpyxl library you could do this in 10 or 20 lines of code, but thats never good idea so I rather copied whole functions that i needed to change and then changed that small part.)

You can save following code in separate file extendedopenpyxl.py:

from openpyxl import load_workbook as openpyxlload_workbook

from openpyxl.reader.excel import _validate_archive, _find_workbook_part

from openpyxl.reader.worksheet import _get_xml_iter

from openpyxl.xml.functions import fromstring, iterparse, safe_iterator, tostring, Element, xmlfile, SubElement

from openpyxl.xml.constants import ARC_CONTENT_TYPES, SHEET_MAIN_NS, SHARED_STRINGS, ARC_ROOT_RELS, ARC_APP, ARC_CORE, ARC_THEME, ARC_SHARED_STRINGS, ARC_STYLE, ARC_WORKBOOK, ARC_WORKBOOK_RELS

from openpyxl.packaging.manifest import Manifest

from openpyxl.packaging.relationship import get_dependents, get_rels_path

from openpyxl.packaging.workbook import WorkbookParser

from openpyxl.packaging.extended import ExtendedProperties

from openpyxl.utils import coordinate_to_tuple

from openpyxl.cell.text import Text

from openpyxl.writer.excel import ExcelWriter as openpyxlExcelWriter

from openpyxl.writer.workbook import write_root_rels, write_workbook_rels, write_workbook

from openpyxl.writer.theme import write_theme

from openpyxl.writer.etree_worksheet import get_rows_to_write

from openpyxl.styles.stylesheet import write_stylesheet

from zipfile import ZipFile, ZIP_DEFLATED

from operator import itemgetter

from io import BytesIO

from xml.etree.ElementTree import tostring as xml_tostring

from xml.etree.ElementTree import register_namespace

from lxml.etree import fromstring as lxml_fromstring



register_namespace('', 'http://schemas.openxmlformats.org/spreadsheetml/2006/main')



def get_value_cells(workbook):

    value_cells = 

    for idx, worksheet in enumerate(workbook.worksheets, 1):

        all_rows = get_rows_to_write(worksheet)

        for row_idx, row in all_rows:

            row = sorted(row, key=itemgetter(0))

            for col, cell in row:

                if cell._value is not None:

                    if cell.data_type == 's':

                        value_cells.append((worksheet.title,(cell.row, cell.col_idx)))

    return value_cells



def check_if_lxml(element):

    if type(element).__module__ == 'xml.etree.ElementTree':

        string = xml_tostring(element)

        el = lxml_fromstring(string)

        return el

    return element



def write_string_table(workbook):

    string_table = workbook.shared_strings

    workbook_data = workbook.new_interal_value_workbook_data

    data_strings = workbook.new_interal_value_data_strings

    value_cells = get_value_cells(workbook)

    out = BytesIO()

    i = 0

    with xmlfile(out) as xf:

        with xf.element("sst", xmlns=SHEET_MAIN_NS, uniqueCount="%d" % len(string_table)):            

            for i, key in enumerate(string_table):

                sheetname, coordinates = value_cells[i]

                if coordinates in workbook_data[sheetname]:

                    value = workbook_data[sheetname][coordinates]

                    xml_el = data_strings[value]

                    el = check_if_lxml(xml_el)

                else:

                    el = Element('si')

                    text = SubElement(el, 't')

                    text.text = key

                    if key.strip() != key:

                        text.set(PRESERVE_SPACE, 'preserve')

                xf.write(el)



    return  out.getvalue()





class ExcelWriter(openpyxlExcelWriter):



    def write_data(self):

        """Write the various xml files into the zip archive."""

        # cleanup all worksheets

        archive = self._archive



        archive.writestr(ARC_ROOT_RELS, write_root_rels(self.workbook))

        props = ExtendedProperties()

        archive.writestr(ARC_APP, tostring(props.to_tree()))



        archive.writestr(ARC_CORE, tostring(self.workbook.properties.to_tree()))

        if self.workbook.loaded_theme:

            archive.writestr(ARC_THEME, self.workbook.loaded_theme)

        else:

            archive.writestr(ARC_THEME, write_theme())



        self._write_worksheets()

        self._write_chartsheets()

        self._write_images()

        self._write_charts()



        string_table_out = write_string_table(self.workbook)

        self._archive.writestr(ARC_SHARED_STRINGS, string_table_out)

        self._write_external_links()



        stylesheet = write_stylesheet(self.workbook)

        archive.writestr(ARC_STYLE, tostring(stylesheet))



        archive.writestr(ARC_WORKBOOK, write_workbook(self.workbook))

        archive.writestr(ARC_WORKBOOK_RELS, write_workbook_rels(self.workbook))



        self._merge_vba()



        self.manifest._write(archive, self.workbook)



        return



    def save(self, filename):

        self.write_data()

        self._archive.close()

        return





def get_coordinates(cell, row_count, col_count):

    coordinate = cell.get('r')

    if coordinate:

        row, column = coordinate_to_tuple(coordinate)

    else:

        row, column = row_count, col_count

    return row, column



def parse_cell(cell):

    VALUE_TAG = '{%s}v' % SHEET_MAIN_NS

    value = cell.find(VALUE_TAG)

    if value is not None:

        value = int(value.text)

    return value



def parse_row(row, row_count):

    CELL_TAG = '{%s}c' % SHEET_MAIN_NS

    if row.get('r'):

        row_count = int(row.get('r'))

    else:

        row_count += 1

    col_count = 0

    data = dict()

    for cell in safe_iterator(row, CELL_TAG):

        col_count += 1

        value = parse_cell(cell)

        if value is not None:

            coordinates = get_coordinates(cell, row_count, col_count)

            data[coordinates] = value

    return data





def parse_sheet(xml_source):

    dispatcher = ['{%s}mergeCells' % SHEET_MAIN_NS, '{%s}col' % SHEET_MAIN_NS, '{%s}row' % SHEET_MAIN_NS, '{%s}conditionalFormatting' % SHEET_MAIN_NS, '{%s}legacyDrawing' % SHEET_MAIN_NS, '{%s}sheetProtection' % SHEET_MAIN_NS, '{%s}extLst' % SHEET_MAIN_NS, '{%s}hyperlink' % SHEET_MAIN_NS, '{%s}tableParts' % SHEET_MAIN_NS]

    row_count = 0

    stream = _get_xml_iter(xml_source)

    it = iterparse(stream, tag=dispatcher)

    row_tag = '{%s}row' % SHEET_MAIN_NS

    data = dict()

    for _, element in it:

        tag_name = element.tag

        if tag_name == row_tag:

            row_data = parse_row(element, row_count)

            data.update(row_data)

            element.clear()

    return data





def get_workbook_parser(archive):

    src = archive.read(ARC_CONTENT_TYPES)

    root = fromstring(src)

    package = Manifest.from_tree(root)

    wb_part = _find_workbook_part(package)

    workbook_part_name = wb_part.PartName[1:]

    parser = WorkbookParser(archive, workbook_part_name)

    parser.parse()

    return parser, package



def get_data_strings(xml_source):

    STRING_TAG = '{%s}si' % SHEET_MAIN_NS

    strings = 

    src = _get_xml_iter(xml_source)

    for _, node in iterparse(src):

        if node.tag == STRING_TAG:

            strings.append(node)



    return strings



def load_workbook(filename, *args, **kwargs):

    workbook = openpyxlload_workbook(filename, *args, **kwargs)



    archive = _validate_archive(filename)

    parser, package = get_workbook_parser(archive)



    workbook_data = dict()

    for sheet, rel in parser.find_sheets():

        sheet_name = sheet.name

        worksheet_path = rel.target

        fh = archive.open(worksheet_path)

        sheet_data = parse_sheet(fh)

        workbook_data[sheet_name] = sheet_data



    data_strings = 

    ct = package.find(SHARED_STRINGS)

    if ct is not None:

        strings_path = ct.PartName[1:]

        strings_source = archive.read(strings_path)

        data_strings = get_data_strings(strings_source)



    workbook.new_interal_value_workbook_data = workbook_data

    workbook.new_interal_value_data_strings = data_strings

    return workbook



def save_workbook(workbook, filename,):

    archive = ZipFile(filename, 'w', ZIP_DEFLATED, allowZip64=True)

    writer = ExcelWriter(workbook, archive)

    writer.save(filename)

    return True



def save_virtual_workbook(workbook,):

    temp_buffer = BytesIO()

    archive = ZipFile(temp_buffer, 'w', ZIP_DEFLATED, allowZip64=True)

    writer = ExcelWriter(workbook, archive)

    try:

        writer.write_data()

    finally:

        archive.close()

    virtual_workbook = temp_buffer.getvalue()

    temp_buffer.close()

    return virtual_workbook

And now if your run this code:

from extendedopenpyxl import load_workbook, save_workbook



workbook = load_workbook(FILENAME1, keep_links=False)

sheet = workbook['Sheet']

sheet['A2'] = 'It is me'

save_workbook(workbook, 'out.xlsx')

When I run this code on excel file that I used in example above I got this result:

enter image description here

As you can see text in cell A1 is formatted as it was, (Hello is in bold, and ត is not).

Yvonne AburrowYvonne Aburrow 1,678823 · Accepted Answer · 2018-11-20 19:52:36Z

1

+50

According to the answer to this question, you can format cells in Excel using openpyxl.

The answer given there only changes the target cell to bold, but maybe you can change the font face back to lemons1.

from openpyxl.workbook import Workbook

from openpyxl.styles import Font

wb = Workbook()

ws = wb.active

ws['B3'] = "Hello"

ws['B3'].font =  Font(name='lemons1', size=14)

wb.save("FontDemo.xlsx")

However, according to the documentation, you can only apply styles to whole cells, not to part of a cell. So you would need to put the Khmer characters in one cell and the English characters in another cell.

answered Nov 20 '18 at 19:52

Yvonne Aburrow

1,678823

Beside would you like to recommend anything I can try as an alternative solution, something like another package, library that I can achieve above result ? Really appreciated thanks

– Houy Narun
Nov 21 '18 at 0:40

sorry, I had never even heard of this package till I saw your post. You might like to investigate the APIs for interacting with Google Spreadsheets?

– Yvonne Aburrow
Nov 21 '18 at 15:39

add a comment |

andsands 739516 · Accepted Answer · 2018-11-24 00:21:07Z

Excel file (with .xlsx extension) is actually zip archive. (You can actually open excel file with 7-zip or some similar program.) So excel file contains bunch of xml files with data stored in them. What openpyxl does, is reading data from these xml files when opening excel file and creating zip archive with xml files when saving excel file. Simply sad, openpyxl reads some xml files, then it parses that data, than you can use funtions in openpyxl library to change and add data, and finally when you save your workbook, openpyxl will create xml files, write data to them, and save them as zip archive (which is excel file). These xml files contain all data stored in excel file, (one xml file contains formulas from excel file, other will contain styles, in other there will be data about excel theme and so on). We only care about strings in excel file which are stored in two xml file:

sharedStrings.xml

This file contains all strings in excel file and formatting of those strings, here is an example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="2" uniqueCount="2">

    <si>

        <r>

            <rPr>

                <b/>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>Hello</t>

        </r>

        <r>

            <rPr>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t xml:space="preserve"> ត</t>

        </r>

    </si>

    <si>

        <r>

            <t xml:space="preserve">ណ </t>

        </r>

        <r>

            <rPr>

                <i/>

                <sz val="24"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>sike</t>

        </r>

    </si>

</sst>

sheet1.xml

This file contains position of your strings (which cell contains which string). (There will be one file for each sheet in you excel file, but let's say you have only one sheet in your file for purpose of this example.) Here is an example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac xr xr2 xr3" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xr:uid="{00000000-0001-0000-0000-000000000000}">

    <dimension ref="A1:C3"/>

    <sheetViews>

        <sheetView tabSelected="1" zoomScaleNormal="100" workbookViewId="0">

            <selection activeCell="A3" sqref="A3"/>

        </sheetView>

    </sheetViews>

    <sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>

    <cols>

        <col min="1" max="1" width="20.140625" customWidth="1"/>

        <col min="2" max="2" width="10.7109375" customWidth="1"/>

    </cols>

    <sheetData>

        <row r="1" spans="1:3" ht="60.75" customHeight="1" x14ac:dyDescent="0.45">

            <c r="A1" s="4" t="s">

                <v>0</v>

            </c>

        </row>

        <row r="2" spans="1:3" ht="19.5" customHeight="1" x14ac:dyDescent="0.35">

            <c r="A2" s="1"/>

            <c r="B2" s="3"/>

        </row>

        <row r="3" spans="1:3" ht="62.25" customHeight="1" x14ac:dyDescent="0.5">

            <c r="A3" s="5" t="s">

                <v>1</v>

            </c>

            <c r="C3" s="2"/>

        </row>

    </sheetData>

    <pageMargins left="0.75" right="0.75" top="1" bottom="1" header="0.5" footer="0.5"/>

    <pageSetup paperSize="9" orientation="portrait" r:id="rId1"/>

</worksheet>

IF you open this excel with openpyxl and than save it (without changing any data), this is what sharedStrings.xml will look like:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">

    <si>

        <t>Hello ត</t>

    </si>

    <si>

        <t>ណ sike</t>

    </si>

</sst>

As you can see you will lose all cell (strings) original formatting and instead you will get some sort of merged formatting for your cells (So if some characters in cell are bold and some are not, then when you save file, either whole cell will be bold or whole cell will be normal). Now people had asked developers to implement this rich text option (link1, link2) but they sad that it would be complicated to implement something like this. I agree that this wouldn't be easy to do, but we can do something simpler: We can get data from sharedStrings.xml when we are opening excel file, and than use that xml code when we want to save excel file but only for cells that existed when we were opening file. This is probably not easy to understand so lets see following example:

Let's say you have excel file like this:
enter image description here

For this excel file, sharedStrings.xml will be this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1" uniqueCount="1">

    <si>

        <r>

            <rPr>

                <b/>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>Hello</t>

        </r>

        <r>

            <rPr>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t xml:space="preserve"> ត</t>

        </r>

    </si>

</sst>

If you run this python code:

from openpyxl import load_workbook

workbook = load_workbook(FILENAME1, keep_links=False)

sheet = workbook.active

sheet['A2'] = 'It is me'

workbook.save('out.xlsx')

File out.xlsx will look like this:

enter image description here

For out.xlsx file, sharedStrings.xml will be this:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">

    <si>

        <t>Hello ត</t>

    </si>

    <si>

        <t>It is me</t>

    </si>

</sst>

So what we want to do is use this xml code:

<si>

    <r>

        <rPr>

            <b/>

            <sz val="22"/>

            <color theme="1"/>

            <rFont val="Calibri"/>

            <family val="2"/>

            <scheme val="minor"/>

        </rPr>

        <t>Hello</t>

    </r>

    <r>

        <rPr>

            <sz val="22"/>

            <color theme="1"/>

            <rFont val="Calibri"/>

            <family val="2"/>

            <scheme val="minor"/>

        </rPr>

        <t xml:space="preserve"> ត</t>

    </r>

</si>

for old cell A1 which contains Hello ត and this xml code:

<si>

    <t>It is me</t>

</si>

for new cell A2 which contains It is me.

So we can combine this xml parts to get xml file like this:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="2">

    <si>

        <r>

            <rPr>

                <b/>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t>Hello</t>

        </r>

        <r>

            <rPr>

                <sz val="22"/>

                <color theme="1"/>

                <rFont val="Calibri"/>

                <family val="2"/>

                <scheme val="minor"/>

            </rPr>

            <t xml:space="preserve"> ត</t>

        </r>

    </si>

    <si>

        <t>It is me</t>

    </si>

</sst>

I wrote some functions to be able to this. (There is quite a lot of code, but most of it is just copied from openpyxl. If you would change openpyxl library you could do this in 10 or 20 lines of code, but thats never good idea so I rather copied whole functions that i needed to change and then changed that small part.)

You can save following code in separate file extendedopenpyxl.py:

from openpyxl import load_workbook as openpyxlload_workbook

from openpyxl.reader.excel import _validate_archive, _find_workbook_part

from openpyxl.reader.worksheet import _get_xml_iter

from openpyxl.xml.functions import fromstring, iterparse, safe_iterator, tostring, Element, xmlfile, SubElement

from openpyxl.xml.constants import ARC_CONTENT_TYPES, SHEET_MAIN_NS, SHARED_STRINGS, ARC_ROOT_RELS, ARC_APP, ARC_CORE, ARC_THEME, ARC_SHARED_STRINGS, ARC_STYLE, ARC_WORKBOOK, ARC_WORKBOOK_RELS

from openpyxl.packaging.manifest import Manifest

from openpyxl.packaging.relationship import get_dependents, get_rels_path

from openpyxl.packaging.workbook import WorkbookParser

from openpyxl.packaging.extended import ExtendedProperties

from openpyxl.utils import coordinate_to_tuple

from openpyxl.cell.text import Text

from openpyxl.writer.excel import ExcelWriter as openpyxlExcelWriter

from openpyxl.writer.workbook import write_root_rels, write_workbook_rels, write_workbook

from openpyxl.writer.theme import write_theme

from openpyxl.writer.etree_worksheet import get_rows_to_write

from openpyxl.styles.stylesheet import write_stylesheet

from zipfile import ZipFile, ZIP_DEFLATED

from operator import itemgetter

from io import BytesIO

from xml.etree.ElementTree import tostring as xml_tostring

from xml.etree.ElementTree import register_namespace

from lxml.etree import fromstring as lxml_fromstring



register_namespace('', 'http://schemas.openxmlformats.org/spreadsheetml/2006/main')



def get_value_cells(workbook):

    value_cells = 

    for idx, worksheet in enumerate(workbook.worksheets, 1):

        all_rows = get_rows_to_write(worksheet)

        for row_idx, row in all_rows:

            row = sorted(row, key=itemgetter(0))

            for col, cell in row:

                if cell._value is not None:

                    if cell.data_type == 's':

                        value_cells.append((worksheet.title,(cell.row, cell.col_idx)))

    return value_cells



def check_if_lxml(element):

    if type(element).__module__ == 'xml.etree.ElementTree':

        string = xml_tostring(element)

        el = lxml_fromstring(string)

        return el

    return element



def write_string_table(workbook):

    string_table = workbook.shared_strings

    workbook_data = workbook.new_interal_value_workbook_data

    data_strings = workbook.new_interal_value_data_strings

    value_cells = get_value_cells(workbook)

    out = BytesIO()

    i = 0

    with xmlfile(out) as xf:

        with xf.element("sst", xmlns=SHEET_MAIN_NS, uniqueCount="%d" % len(string_table)):            

            for i, key in enumerate(string_table):

                sheetname, coordinates = value_cells[i]

                if coordinates in workbook_data[sheetname]:

                    value = workbook_data[sheetname][coordinates]

                    xml_el = data_strings[value]

                    el = check_if_lxml(xml_el)

                else:

                    el = Element('si')

                    text = SubElement(el, 't')

                    text.text = key

                    if key.strip() != key:

                        text.set(PRESERVE_SPACE, 'preserve')

                xf.write(el)



    return  out.getvalue()





class ExcelWriter(openpyxlExcelWriter):



    def write_data(self):

        """Write the various xml files into the zip archive."""

        # cleanup all worksheets

        archive = self._archive



        archive.writestr(ARC_ROOT_RELS, write_root_rels(self.workbook))

        props = ExtendedProperties()

        archive.writestr(ARC_APP, tostring(props.to_tree()))



        archive.writestr(ARC_CORE, tostring(self.workbook.properties.to_tree()))

        if self.workbook.loaded_theme:

            archive.writestr(ARC_THEME, self.workbook.loaded_theme)

        else:

            archive.writestr(ARC_THEME, write_theme())



        self._write_worksheets()

        self._write_chartsheets()

        self._write_images()

        self._write_charts()



        string_table_out = write_string_table(self.workbook)

        self._archive.writestr(ARC_SHARED_STRINGS, string_table_out)

        self._write_external_links()



        stylesheet = write_stylesheet(self.workbook)

        archive.writestr(ARC_STYLE, tostring(stylesheet))



        archive.writestr(ARC_WORKBOOK, write_workbook(self.workbook))

        archive.writestr(ARC_WORKBOOK_RELS, write_workbook_rels(self.workbook))



        self._merge_vba()



        self.manifest._write(archive, self.workbook)



        return



    def save(self, filename):

        self.write_data()

        self._archive.close()

        return





def get_coordinates(cell, row_count, col_count):

    coordinate = cell.get('r')

    if coordinate:

        row, column = coordinate_to_tuple(coordinate)

    else:

        row, column = row_count, col_count

    return row, column



def parse_cell(cell):

    VALUE_TAG = '{%s}v' % SHEET_MAIN_NS

    value = cell.find(VALUE_TAG)

    if value is not None:

        value = int(value.text)

    return value



def parse_row(row, row_count):

    CELL_TAG = '{%s}c' % SHEET_MAIN_NS

    if row.get('r'):

        row_count = int(row.get('r'))

    else:

        row_count += 1

    col_count = 0

    data = dict()

    for cell in safe_iterator(row, CELL_TAG):

        col_count += 1

        value = parse_cell(cell)

        if value is not None:

            coordinates = get_coordinates(cell, row_count, col_count)

            data[coordinates] = value

    return data





def parse_sheet(xml_source):

    dispatcher = ['{%s}mergeCells' % SHEET_MAIN_NS, '{%s}col' % SHEET_MAIN_NS, '{%s}row' % SHEET_MAIN_NS, '{%s}conditionalFormatting' % SHEET_MAIN_NS, '{%s}legacyDrawing' % SHEET_MAIN_NS, '{%s}sheetProtection' % SHEET_MAIN_NS, '{%s}extLst' % SHEET_MAIN_NS, '{%s}hyperlink' % SHEET_MAIN_NS, '{%s}tableParts' % SHEET_MAIN_NS]

    row_count = 0

    stream = _get_xml_iter(xml_source)

    it = iterparse(stream, tag=dispatcher)

    row_tag = '{%s}row' % SHEET_MAIN_NS

    data = dict()

    for _, element in it:

        tag_name = element.tag

        if tag_name == row_tag:

            row_data = parse_row(element, row_count)

            data.update(row_data)

            element.clear()

    return data





def get_workbook_parser(archive):

    src = archive.read(ARC_CONTENT_TYPES)

    root = fromstring(src)

    package = Manifest.from_tree(root)

    wb_part = _find_workbook_part(package)

    workbook_part_name = wb_part.PartName[1:]

    parser = WorkbookParser(archive, workbook_part_name)

    parser.parse()

    return parser, package



def get_data_strings(xml_source):

    STRING_TAG = '{%s}si' % SHEET_MAIN_NS

    strings = 

    src = _get_xml_iter(xml_source)

    for _, node in iterparse(src):

        if node.tag == STRING_TAG:

            strings.append(node)



    return strings



def load_workbook(filename, *args, **kwargs):

    workbook = openpyxlload_workbook(filename, *args, **kwargs)



    archive = _validate_archive(filename)

    parser, package = get_workbook_parser(archive)



    workbook_data = dict()

    for sheet, rel in parser.find_sheets():

        sheet_name = sheet.name

        worksheet_path = rel.target

        fh = archive.open(worksheet_path)

        sheet_data = parse_sheet(fh)

        workbook_data[sheet_name] = sheet_data



    data_strings = 

    ct = package.find(SHARED_STRINGS)

    if ct is not None:

        strings_path = ct.PartName[1:]

        strings_source = archive.read(strings_path)

        data_strings = get_data_strings(strings_source)



    workbook.new_interal_value_workbook_data = workbook_data

    workbook.new_interal_value_data_strings = data_strings

    return workbook



def save_workbook(workbook, filename,):

    archive = ZipFile(filename, 'w', ZIP_DEFLATED, allowZip64=True)

    writer = ExcelWriter(workbook, archive)

    writer.save(filename)

    return True



def save_virtual_workbook(workbook,):

    temp_buffer = BytesIO()

    archive = ZipFile(temp_buffer, 'w', ZIP_DEFLATED, allowZip64=True)

    writer = ExcelWriter(workbook, archive)

    try:

        writer.write_data()

    finally:

        archive.close()

    virtual_workbook = temp_buffer.getvalue()

    temp_buffer.close()

    return virtual_workbook

And now if your run this code:

from extendedopenpyxl import load_workbook, save_workbook



workbook = load_workbook(FILENAME1, keep_links=False)

sheet = workbook['Sheet']

sheet['A2'] = 'It is me'

save_workbook(workbook, 'out.xlsx')

When I run this code on excel file that I used in example above I got this result:

enter image description here

As you can see text in cell A1 is formatted as it was, (Hello is in bold, and ត is not).

搜尋此網誌

Tukukkk

How to keep style format unchanged after writing data using OpenPyXL package in Python?

2 Answers
2

Your Answer

Post as a guest

2 Answers
2

2 Answers
2

Post as a guest

Popular posts from this blog

404 Error Contact Form 7 ajax form submitting

How to know if a Active Directory user can login interactively

C# WPF - Problem with Material Design Textbox

How to keep style format unchanged after writing data using OpenPyXL package in Python?

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

2 Answers 2

2 Answers 2

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

404 Error Contact Form 7 ajax form submitting

How to know if a Active Directory user can login interactively

C# WPF - Problem with Material Design Textbox

2 Answers
2

2 Answers
2

2 Answers
2