jeudi 5 février 2015

Improve performance of converting a large xml file (~300 MB) to relational table in SQL Server


So this is what I have so far:



--Read xml content into a XML data type variable
DECLARE @FileData XML
SELECT @FileData = CONVERT(XML, BulkColumn)
FROM OPENROWSET(BULK '\\file_path\test.xml', SINGLE_BLOB) AS x
--Read from the XML variable to create Entity-Attribute-Value table
SELECT N1.Id.value('@Id', 'varchar(50)') as Id
, N1.Id.value('@Name', 'varchar(100)') as Name
, N2.AttributeLongName.value('@AttributeName', 'varchar(100)') as AttributeName
, N3.AttributeValue.value('.', 'varchar(MAX)') as AttributeValue
FROM @FileData.nodes('/Data/Entities/Entity') as N1(Id) ---1st lvl Node contains the Entity
cross apply Id.nodes('Attributes/Attribute') as N2(AttributeName) --2nd lvl Node contains AttributeName
cross apply AttributeName.nodes('Values/Value') as N3(AttributeValue) --3rd lvl Node contains AttributeValue


This code performs decently when run against an 8 MB XML file (~15s). However, when I try to run it against a 300 MB XML file of the same structure, it takes hours.


Does this mean that the XQuery nodes() method does not scale linearly or scales worst?


Is there any other way I can improve the performance on this?





Aucun commentaire:

Enregistrer un commentaire