{"id":7083,"date":"2022-03-02T07:20:06","date_gmt":"2022-03-02T07:20:06","guid":{"rendered":"http:\/\/escience.sdu.dk\/?p=7083"},"modified":"2022-03-02T07:20:06","modified_gmt":"2022-03-02T07:20:06","slug":"detecting-text-reuse-in-h-c-andersens-work","status":"publish","type":"post","link":"http:\/\/escience.sdu.dk\/index.php\/2022\/03\/02\/detecting-text-reuse-in-h-c-andersens-work\/","title":{"rendered":"Detecting text reuse in H.C. Andersen&#8217;s work"},"content":{"rendered":"\n<p>PI:&nbsp;Nils Holger&nbsp;Berg,&nbsp;<a href=\"https:\/\/andersen.sdu.dk\/\">The Hans Christian Andersen Centre,<\/a>&nbsp;<a href=\"https:\/\/www.sdu.dk\/en\/om_sdu\/institutter_centre\/ikv\">Department for the Study of Culture<\/a><\/p>\n\n\n\n<p>The project\u00a0<em>Detecting text reuse in H.C. Andersen\u2019s work<\/em>\u00a0is part of a larger publication project,\u00a0<em>Hans Christian Andersen\u2019s Fairy Tales and Stories \u2013 the digital manuscript edition,\u00a0<\/em>led by Associate Professor Ane Grum-Schwensen,\u00a0which aims to digitalize and publish all the preserved manuscripts of Hans Christian Andersen\u00a0in an online, genetic edition. The digital edition is described in more detail at\u00a0<a href=\"http:\/\/andersen.sdu.dk\/ms\">http:\/\/andersen.sdu.dk\/ms<\/a>; the updated description is in Danish, but an older English version is available at\u00a0<a href=\"http:\/\/beta.auh.sdu.dk\/en\/\">http:\/\/beta.auh.sdu.dk\/en\/<\/a>.\u00a0<\/p>\n\n\n\n<p>In\u00a02019,\u00a0senior researcher\u00a0Ejnar Stig Askgaard from Odense City Museums began comparing\u00a0Hans\u00a0Christian Andersen\u2019s notes, written between approximately 1833 and 1875, with the 162 fairy tales, novels and autobiographies. 
This\u00a0led to the discovery that Hans\u00a0Christian\u00a0Andersen liked to use symbols such as cross marks or deletions in his notes to indicate that the note had been reused in his fairytales.\u00a0<\/p>\n\n\n\n<p>For\u00a0<em>Detecting text reuse in H.C. Andersen\u2019s\u00a0work<\/em>,\u00a0Berg\u00a0wanted to find out where each note had been reused.\u00a0Earlier research had managed\u00a0to manually identify where 278 notes had been reused in Hans\u00a0Christian\u00a0Andersen\u2019s published work, but this was a time-consuming effort that took many months of work.<\/p>\n\n\n\n<p>Because 861 of the notes, as well as Hans&nbsp;Christian&nbsp;Andersen\u2019s published work, had been digitalized,&nbsp;Berg was able to apply digital methods to the problem. He contacted&nbsp;Zhiru Sun, Assistant Professor at the Department of Design and Communication at SDU,&nbsp;who used&nbsp;Natural Language Processing methods to measure the similarity between the notes and Hans&nbsp;Christian&nbsp;Andersen\u2019s published work.&nbsp;Using&nbsp;the Python application on&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/cloud.sdu.dk\/app\/dashboard\" target=\"_blank\">UCloud<\/a>, Sun&nbsp;generated&nbsp;a number of tables that indicate how similar a specific note is to a specific fairytale.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p><em>\u201cIt only took me around 8 hours to generate these tables and find a good indication of where all the 861 digitalized notes had been reused,\u201d Sun&nbsp;explains.&nbsp;<\/em><\/p><\/blockquote>\n\n\n\n<figure class=\"wp-block-gallery columns-2 is-cropped\"><ul class=\"blocks-gallery-grid\"><li class=\"blocks-gallery-item\"><figure><img loading=\"lazy\" width=\"300\" height=\"161\" src=\"https:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.20-300x161.png\" alt=\"\" data-id=\"7069\" 
data-full-url=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.20.png\" data-link=\"http:\/\/escience.sdu.dk\/?attachment_id=7069\" class=\"wp-image-7069\" srcset=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.20-300x161.png 300w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.20-1024x548.png 1024w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.20-768x411.png 768w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.20-1536x822.png 1536w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.20.png 1540w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/figure><\/li><li class=\"blocks-gallery-item\"><figure><img loading=\"lazy\" width=\"300\" height=\"165\" src=\"https:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.46.39-300x165.png\" alt=\"\" data-id=\"7090\" data-full-url=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.46.39.png\" data-link=\"http:\/\/escience.sdu.dk\/?attachment_id=7090\" class=\"wp-image-7090\" srcset=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.46.39-300x165.png 300w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.46.39-1024x565.png 1024w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.46.39-768x423.png 768w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.46.39-1536x847.png 1536w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.46.39.png 1988w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/figure><\/li><li class=\"blocks-gallery-item\"><figure><img loading=\"lazy\" width=\"300\" height=\"161\" 
src=\"https:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.44.51-300x161.png\" alt=\"\" data-id=\"7089\" data-full-url=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.44.51.png\" data-link=\"http:\/\/escience.sdu.dk\/?attachment_id=7089\" class=\"wp-image-7089\" srcset=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.44.51-300x161.png 300w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.44.51-1024x551.png 1024w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.44.51-768x413.png 768w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.44.51-1536x827.png 1536w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-31-at-08.44.51.png 1802w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/figure><\/li><li class=\"blocks-gallery-item\"><figure><img loading=\"lazy\" width=\"300\" height=\"164\" src=\"https:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.40-300x164.png\" alt=\"\" data-id=\"7072\" data-full-url=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.40.png\" data-link=\"http:\/\/escience.sdu.dk\/?attachment_id=7072\" class=\"wp-image-7072\" srcset=\"http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.40-300x164.png 300w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.40-1024x561.png 1024w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.40-768x421.png 768w, http:\/\/escience.sdu.dk\/wp-content\/uploads\/2022\/01\/Screen-Shot-2022-01-04-at-12.26.40.png 1496w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/figure><\/li><\/ul><\/figure>\n\n\n\n<p><em>In the tables above, each note has received a score from -1 to 1. 
The closer the score is to 1, the more similar the note is to the fairytale; the closer it is to -1, the less similar. Note_61, for example, is a shopping list, and its low score indicates that it is very different from all the fairytales.<\/em><\/p>\n\n\n\n<p>You can read a more detailed interview with Zhiru Sun <a href=\"https:\/\/escience.sdu.dk\/index.php\/news\/digital-humanities\/\" data-type=\"URL\" data-id=\"https:\/\/escience.sdu.dk\/index.php\/news\/digital-humanities\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>PI:&nbsp;Nils Holger&nbsp;Berg,&nbsp;The Hans Christian Andersen Centre,&nbsp;Department for the Study of Culture The project\u00a0Detecting text reuse in H.C. Andersen\u2019s work\u00a0is part of a larger publication project,\u00a0Hans Christian Andersen\u2019s Fairy Tales and Stories \u2013 the digital manuscript<a class=\"moretag\" href=\"http:\/\/escience.sdu.dk\/index.php\/2022\/03\/02\/detecting-text-reuse-in-h-c-andersens-work\/\"> Read 
more&hellip;<\/a><\/p>\n","protected":false},"author":11,"featured_media":7073,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[],"tags":[],"_links":{"self":[{"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/posts\/7083"}],"collection":[{"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/comments?post=7083"}],"version-history":[{"count":21,"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/posts\/7083\/revisions"}],"predecessor-version":[{"id":7121,"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/posts\/7083\/revisions\/7121"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/media\/7073"}],"wp:attachment":[{"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/media?parent=7083"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/categories?post=7083"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/escience.sdu.dk\/index.php\/wp-json\/wp\/v2\/tags?post=7083"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}