class: center, middle, inverse, title-slide

.title[
# Programming Tools in Data Science
]
.subtitle[
## Lecture #8: Webscraping
]
.author[
### Samuel Orso
]
.date[
### 17 October 2024
]

---

# Webscraping with R

``` r
library(rvest)
url <- "https://ptds.samorso.ch/lectures/"
read_html(url) %>%
  html_table() %>%
  .[[1]] %>%
  .[5:7,] %>%
  kableExtra::kable()
```

<table>
 <thead>
  <tr> <th style="text-align:right;"> Week </th> <th style="text-align:left;"> Date </th> <th style="text-align:left;"> Topic </th> <th style="text-align:left;"> Instructor </th> </tr>
 </thead>
<tbody>
  <tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> 17 Oct </td> <td style="text-align:left;"> Object-oriented programming, Webscraping, Shiny App </td> <td style="text-align:left;"> Samuel </td> </tr>
  <tr> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> 24 Oct </td> <td style="text-align:left;"> Exercise and Homework 2 </td> <td style="text-align:left;"> Timofei </td> </tr>
  <tr> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> 31 Oct </td> <td style="text-align:left;"> Functional programming, Package creation, Advanced shiny App, </td> <td style="text-align:left;"> Samuel </td> </tr>
</tbody>
</table>

---

# API

* **A**pplication **P**rogramming **I**nterfaces (APIs) are the gold standard for fetching data from the web.
* Data is fetched by sending HTTP requests directly to the provider.
* Requests can be made from `R` with `library(httr)` or with API wrapper packages.
* Data fetched through an API is generally more reliable than scraped data.

<table>
 <thead>
  <tr> <th style="text-align:left;"> Provider </th> <th style="text-align:left;"> Registration </th> <th style="text-align:left;"> Wrapper </th> </tr>
 </thead>
<tbody>
  <tr> <td style="text-align:left;"> Twitter </td> <td style="text-align:left;"> TRUE </td> <td style="text-align:left;"> TRUE </td> </tr>
  <tr> <td style="text-align:left;"> Financial Times </td> <td style="text-align:left;"> TRUE </td> <td style="text-align:left;"> TRUE </td> </tr>
  <tr> <td style="text-align:left;"> Open Weather Map </td> <td style="text-align:left;"> TRUE </td> <td style="text-align:left;"> TRUE </td> </tr>
  <tr> <td style="text-align:left;"> DeepL </td> <td style="text-align:left;"> TRUE </td> <td style="text-align:left;"> TRUE </td> </tr>
</tbody>
</table>

---

# API example: Wikipedia pageviews

``` r
library(pageviews)
top_articles("en.wikipedia", start = (Sys.Date()-1)) %>%
  dplyr::select(article, views) %>%
  dplyr::top_n(10)
```

```
## Selecting by views
```

```
##                        article   views
## 1                    Main_Page 4607375
## 2                   Liam_Payne 2439358
## 3               Special:Search 1326444
## 4  Wikipedia:Featured_pictures  702086
## 5              Cheryl_(singer)  294461
## 6                One_Direction  287302
## 7       Lyle_and_Erik_Menendez  257281
## 8                Thomas_Tuchel  240577
## 9             Lawrence_Bishnoi  173467
## 10              Deaths_in_2024  148410
```

---

# API example: translation with DeepL

```r
library(deeplr)
deeplr::translate2(
  text = "Mais quelle bonne traduction nom d'une pipe!",
  target_lang = "EN",
  auth_key = my_key
  )
```

```
## [1] "But what a great translation!"
```

This is what we obtain on Google Translate:

> But what a good translation of the name of a pipe!

---

# API example: ChatGPT

``` r
library(chatgpt)
cat(ask_chatgpt("What do you think about the Programming Tools in Data Science class in R?"))
```

```
## 
## *** ChatGPT input:
## 
## What do you think about the Programming Tools in Data Science class in R?
```

```
## The Programming Tools in Data Science class in R is a great resource for learning essential programming skills for data science tasks. By mastering tools and techniques in R, you'll be well-equipped to efficiently manipulate and analyze datasets, visualize data, and perform statistical analysis. This class can help you become fluent in R and improve your overall proficiency in data science.
```

---

# Webscraping with R

* If no API wrapper is available, e.g. there is no `R` package on CRAN or GitHub, you could try to build your own wrapper by following, for example, [this tutorial](https://colinfay.me/build-api-wrapper-package-r/) or [that one](https://httr2.r-lib.org/articles/wrapping-apis.html) (not covered in this class).

* Instead, we discuss webscraping, a method that is effective regardless of whether a website offers an API.

---

# Scraping?

<center>
<div style="width:800px"><iframe allow="fullscreen" frameBorder="0" height="450" src="https://giphy.com/embed/Q8VCAek0MGjRK" width="800"></iframe></div>
</center>

---

# HTTP request/response cycle

<img src="images/http_request_response.png" width="1680" />

---

# HyperText Markup Language

``` html
<!DOCTYPE html>
<html>
<body>

  <h1 id='first'>Webscraping with R</h1>
  <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>

</body>
</html>
```

.bottom[[Try it!](https://www.w3schools.com/html/tryit.asp?filename=tryhtml_default)]

---

# HTML

* An **element** starts with `<tag>` and ends with `</tag>`,
* it has optional **attributes** (e.g. `id='first'`),
* its **content** is everything between the two tags.
* For example, add the attribute `style="background-color:DodgerBlue;"` to `h1` and try it.

---

# HTML elements

tag | meaning
--- | ---
p | Paragraph
h1 | Top-level heading
h2, h3, ... | Lower-level headings
ol | Ordered list
ul | Unordered list
li | List item
img | Image
a | Anchor (Hyperlink)
div | Section wrapper (block-level)
span | Text wrapper (in-line)

Find out more tags [here](https://developer.mozilla.org/en-US/docs/Web/HTML) or [here](https://www.w3schools.com/tags/)

---

# Data extraction

Create an HTML page with `minimal_html` for experimenting

``` r
html_page <- minimal_html('
<body>
  <h1>Webscraping with R</h1>
  <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>
</body>')
```

---

# Example: list item (li)

``` html
  ...
  <h2>Technologies</h2>
  <ol>
*   <li>HTML: <em>Hypertext Markup Language</em></li>
*   <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
*   <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  ...
```

``` r
html_page %>% html_nodes("li")
```

```
## {xml_nodeset (3)}
## [1] <li>HTML: <em>Hypertext Markup Language</em>\n</li>
## [2] <li>CSS: <em>Cascading Style Sheets</em>\n</li>
## [3] <li>rvest</li>
```

``` r
html_page %>% html_nodes("li") %>% html_text()
```

```
## [1] "HTML: Hypertext Markup Language" "CSS: Cascading Style Sheets"
## [3] "rvest"
```

---

# Example: second-level heading (h2)

``` html
  ...
* <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
* <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  ...
```

``` r
html_page %>% html_nodes("h2") %>% html_text()
```

```
## [1] "Technologies" "Packages"
```

---

# Example: emphasized text (em)

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
*   familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
*   <li>HTML: <em>Hypertext Markup Language</em></li>
*   <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
*   <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("em") %>% html_text()
```

```
## [1] "Tidyverse"                 "Hypertext Markup Language"
## [3] "Cascading Style Sheets"    "rvest"
## [5] "tidyverse"
```

---

# Cascading Style Sheets (CSS)

* CSS is used to specify the style (appearance, arrangement and variations) of your web pages.

``` html
<style>
body {
  background-color: lightblue;
}
h1 {
  color: white;
  text-align: center;
}
.content {
  font-family: monospace;
  font-size: 1.5em;
  color: black;
}
#intro {
  background-color: lightgrey;
  border-style: solid;
  border-width: 5px;
  padding: 5px;
  margin: 5px;
  text-align: center;
}
</style>
...
```

---

# Combining commands with CSS selector

selector | meaning
--- | ---
`,` | grouping
space | descendant
`>` | child
`+` | adjacent sibling
`~` | general sibling
`:first-child` | first child
`:nth-child(n)` | n-th child
`:last-child` | last child
`.` | class selector
`#` | id selector

.center[[CSS diner](https://flukeout.github.io/), [CSS selector](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors), [W3 School](https://www.w3schools.com/css/css_selectors.asp)]

---

# CSS Selector: grouping (`,`)

* The grouping selector combines several selectors: an element is selected if it matches any of them (in a style sheet, they all receive the same style definitions).
* For example, `div, p` selects all `<div>` elements and all `<p>` elements.
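
The `div, p` case can be tried on a small page. This is a minimal sketch, assuming `rvest` is installed; the toy page and the names `toy_page` and `grouped` are hypothetical and not part of the lecture material. Matches are returned in document order, whichever of the two selectors they satisfy.

```r
library(rvest)

# Hypothetical toy page, used only for this illustration
toy_page <- minimal_html('
  <div>First block</div>
  <p>First paragraph</p>
  <span>Not selected</span>
  <p>Second paragraph</p>
')

# "div, p" keeps every <div> and every <p>; the <span> is dropped
grouped <- toy_page %>% html_nodes("div, p") %>% html_text()
grouped
```
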

---

# Example: grouping `li` and `em`

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
*   familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
*   <li>HTML: <em>Hypertext Markup Language</em></li>
*   <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
*   <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
*   <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("li, em") %>% html_text()
```

```
## [1] "Tidyverse"                       "HTML: Hypertext Markup Language"
## [3] "Hypertext Markup Language"       "CSS: Cascading Style Sheets"
## [5] "Cascading Style Sheets"          "rvest"
## [7] "rvest"                           "tidyverse"
```

---

# CSS Selector: descendant selector (`space`)

* The descendant selector matches all elements that are descendants of a specified element, at any depth.
* For example, `div p` selects all `<p>` elements inside `<div>` elements.

---

# Example: all `em` that are descendants of `li`

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
*   <li>HTML: <em>Hypertext Markup Language</em></li>
*   <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("li em") %>% html_text()
```

```
## [1] "Hypertext Markup Language" "Cascading Style Sheets"
```

---

# CSS Selector: child selector (`>`)

* The child selector selects all elements that are direct children of a specified element.
* For example, `div > p` selects all `<p>` elements whose parent is a `<div>` element.
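
The difference between the descendant and the child selector shows up as soon as elements are nested more than one level deep. The sketch below assumes `rvest` is installed; the toy page and the variable names are hypothetical, chosen only to contrast `div p` with `div > p`.

```r
library(rvest)

# Hypothetical toy page: one <p> directly inside the <div>,
# one nested one level deeper, and one outside the <div>
toy_page <- minimal_html('
  <div>
    <p>Direct child</p>
    <blockquote><p>Nested deeper</p></blockquote>
  </div>
  <p>Outside the div</p>
')

# Descendant selector: any <p> inside the <div>, at any depth
descendants <- toy_page %>% html_nodes("div p") %>% html_text()

# Child selector: only <p> elements whose parent is the <div>
children <- toy_page %>% html_nodes("div > p") %>% html_text()

descendants
children
```

Every match of `div > p` is also a match of `div p`, but not the other way round.
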

---

# Example: all `em` that are children of `p`

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
*   familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
*   <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("p > em") %>% html_text()
```

```
## [1] "Tidyverse" "rvest"     "tidyverse"
```

---

# CSS Selector: adjacent sibling selector (`+`)

* The adjacent sibling selector selects an element that comes directly after another specific element.
* Sibling elements must have the same parent element, and "adjacent" means "immediately following".
* For example, `div + p` selects every `<p>` element that immediately follows a `<div>` element.

---

# Example: `em` immediately after `p`

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("p + em") %>% html_text()
```

```
## character(0)
```

No `em` element immediately follows a `p` element: every `em` on this page is nested inside a `p` or an `li`, not placed after one.

---

# Example: `em` immediately after `em`

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
*   <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("em + em") %>% html_text()
```

```
## [1] "tidyverse"
```

---

# CSS Selector: general sibling selector (`~`)

* The general sibling selector selects all elements that follow a specified element and share its parent.
* Sibling elements must have the same parent element, and "general" means the element may appear anywhere after the specified one, not only immediately after it.
* For example, `div ~ p` selects all `<p>` elements that are preceded by a `<div>` element with the same parent.
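
To see how `+` and `~` differ, the following sketch applies both to the same page. It assumes `rvest` is installed; the toy page and the names `adjacent` and `general` are hypothetical.

```r
library(rvest)

# Hypothetical toy page: two <p> siblings after a <div>,
# separated by an <h2>
toy_page <- minimal_html('
  <div>Box</div>
  <p>Immediately after the div</p>
  <h2>Heading</h2>
  <p>Later sibling of the div</p>
')

# Adjacent sibling: only the <p> that directly follows the <div>
adjacent <- toy_page %>% html_nodes("div + p") %>% html_text()

# General sibling: every <p> that follows the <div> under the same parent
general <- toy_page %>% html_nodes("div ~ p") %>% html_text()

adjacent
general
```
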

---

# Example: `em` following sibling of `a`

``` html
* <p> Basic experience with <a href="www.r-project.org">R</a> and
*   familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("a ~ em") %>% html_text()
```

```
## [1] "Tidyverse"
```

(Here, we would have obtained the same result with `a + em`.)

---

# CSS Selector: first child selector (`:first-child`)

* `:first-child` selects the specified element only if it is the first child of its parent.
* For example, `p:first-child` selects all `<p>` elements that are the first child of their parent.

---

# Example: all `li` that are first children

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
*   <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
*   <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("li:first-child") %>% html_text()
```

```
## [1] "HTML: Hypertext Markup Language" "rvest"
```

---

# CSS Selector: nth child selector (`:nth-child(n)`)

* Remark: `:last-child` works like `:first-child`, but for the last child of the parent.
* `:nth-child(n)` selects the specified element only if it is the n-th child of its parent.
* For example, `p:nth-child(2)` selects all `<p>` elements that are the second child of their parent.
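
One point worth checking by hand is that the position counts all element children of the parent, not only those with the same tag. The sketch below assumes `rvest` is installed; the toy page and variable names are hypothetical.

```r
library(rvest)

# Hypothetical toy page: the <div> has three children, <h2>, <p>, <p>
toy_page <- minimal_html('
  <div>
    <h2>Title</h2>
    <p>first paragraph</p>
    <p>second paragraph</p>
  </div>
')

# No <p> is the first child of its parent (the first child is the <h2>)
first_p <- toy_page %>% html_nodes("p:first-child") %>% html_text()

# The first <p> is the second child of the <div>
second_child_p <- toy_page %>% html_nodes("p:nth-child(2)") %>% html_text()

# The second <p> is the last child of the <div>
last_p <- toy_page %>% html_nodes("p:last-child") %>% html_text()

first_p
second_child_p
last_p
```

So `p:nth-child(2)` reads as "a `<p>` that is the second child of its parent", not "the second `<p>`".
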

---

# Example: all `li` that are second children

``` html
  <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
*   <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
    <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("li:nth-child(2)") %>% html_text()
```

```
## [1] "CSS: Cascading Style Sheets"
```

---

# HTML attributes

* All HTML elements can have attributes, which provide additional information about the element.
* Attributes are always specified in the start tag, usually in the format `name="value"`.
* For example, in `<a href="www.r-project.org">R</a>`, `href` is an attribute of `a` that specifies a URL.
* Attributes can be accessed with the `html_attr()` function.
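
When some of the selected elements lack the requested attribute, `html_attr()` returns `NA` for them, or the value passed as `default`. A minimal sketch, assuming `rvest` is installed; the toy page and variable names are hypothetical.

```r
library(rvest)

# Hypothetical toy page: one link with attributes, one without
toy_page <- minimal_html('
  <a href="https://www.r-project.org" title="R homepage">R</a>
  <a>No link here</a>
')

links <- toy_page %>% html_nodes("a")

# One value per selected element; NA where the attribute is missing
hrefs <- links %>% html_attr("href")

# `default` replaces the NA for elements without the attribute
hrefs_filled <- links %>% html_attr("href", default = "")

hrefs
hrefs_filled
```
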

---

# Example: `href` attributes

``` html
* <p> Basic experience with <a href="www.r-project.org">R</a> and
    familiarity with the <em>Tidyverse</em> is recommended.</p>
  <h2>Technologies</h2>
  <ol>
    <li>HTML: <em>Hypertext Markup Language</em></li>
    <li>CSS: <em>Cascading Style Sheets</em></li>
  </ol>
  <h2>Packages</h2>
  <ul>
*   <a href="https://github.com/tidyverse/rvest"><li>rvest</li></a>
  </ul>
  <p><strong>Note</strong>:
    <em>rvest</em> is included in the <em>tidyverse</em></p>
```

``` r
html_page %>% html_nodes("a") %>% html_attr("href")
```

```
## [1] "www.r-project.org"                  "https://github.com/tidyverse/rvest"
```

---

# HTML tables

tag | meaning
--- | ---
table | Table
tr | Table row
td | Table cell
th | Table header cell

* Tables can be fetched with the `html_table()` function.

---

``` r
basic_table <- minimal_html('
<body>
  <table>
    <tr>
      <th>Month</th>
      <th>Savings</th>
    </tr>
    <tr>
      <td>January</td>
      <td>$100</td>
    </tr>
    <tr>
      <td>February</td>
      <td>$80</td>
    </tr>
  </table>
</body>
')
```

``` r
basic_table %>% html_table()
```

```
## [[1]]
## # A tibble: 2 × 2
##   Month    Savings
##   <chr>    <chr>
## 1 January  $100
## 2 February $80
```

---

# Example: Wikipedia table

* We would like to fetch the table of qualified teams from the Wikipedia page of the 2023 Rugby World Cup.
* A first solution: fetch all tables and select the correct one.
``` r url <- "https://en.wikipedia.org/wiki/2023_Rugby_World_Cup" url %>% read_html() %>% html_table() %>% .[[4]] %>% kableExtra::kable() ``` <table> <thead> <tr> <th style="text-align:left;"> Region </th> <th style="text-align:left;"> Team </th> <th style="text-align:left;"> Qualificationmethod </th> <th style="text-align:right;"> Previous.mw-parser-output .tooltip-dotted{border-bottom:1px dotted;cursor:help}apps </th> <th style="text-align:left;"> Previous best result </th> <th style="text-align:right;"> World Rank¹ </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> South Africa </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Champions (1995, 2007, 2019) </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> Namibia </td> <td style="text-align:left;"> Africa 1 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Pool stage (six times) </td> <td style="text-align:right;"> 21 </td> </tr> <tr> <td style="text-align:left;"> Asia </td> <td style="text-align:left;"> Japan </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (2019) </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> France </td> <td style="text-align:left;"> Hosts </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Runners-up (1987, 1999, 2011) </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> England </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (2003) </td> <td style="text-align:right;"> 8 </td> </tr> 
<tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Ireland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (seven times) </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Italy </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Pool stage (nine times) </td> <td style="text-align:right;"> 13 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Scotland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Fourth place (1991) </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Wales </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (1987) </td> <td style="text-align:right;"> 10 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Georgia </td> <td style="text-align:left;"> Europe 1 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Pool stage (five times) </td> <td style="text-align:right;"> 11 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Romania </td> <td style="text-align:left;"> Europe 2 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 19 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Portugal </td> <td style="text-align:left;"> Final Qualifier </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Pool stage (2007) </td> <td 
style="text-align:right;"> 16 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Australia </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1991, 1999) </td> <td style="text-align:right;"> 9 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Fiji </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1987, 2007) </td> <td style="text-align:right;"> 7 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> New Zealand </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1987, 2011, 2015) </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Samoa </td> <td style="text-align:left;"> Oceania 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1991, 1995) </td> <td style="text-align:right;"> 12 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Tonga </td> <td style="text-align:left;"> Asia/Pacific 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Argentina </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (2007) </td> <td style="text-align:right;"> 6 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Uruguay </td> <td style="text-align:left;"> Americas 
1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Pool stage (1999, 2003, 2015, 2019) </td> <td style="text-align:right;"> 17 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Chile </td> <td style="text-align:left;"> Americas 2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> Debut </td> <td style="text-align:right;"> 22 </td> </tr> </tbody> </table> --- # Example: Wikipedia table * Inspect the HTML with the developer tools. <img src="images/wikitable_rugby.png" width="2485" /> --- # Example: Wikipedia table * A better solution using CSS selectors: using the class selector (`.`). * Select `class="wikitable"`. ``` r url <- "https://en.wikipedia.org/wiki/2023_Rugby_World_Cup" url %>% read_html() %>% html_nodes(".wikitable") %>% html_table() %>% .[[3]] %>% kableExtra::kable() # equivalently html_nodes("table.wikitable") ``` <table> <thead> <tr> <th style="text-align:left;"> Region </th> <th style="text-align:left;"> Team </th> <th style="text-align:left;"> Qualificationmethod </th> <th style="text-align:right;"> Previous.mw-parser-output .tooltip-dotted{border-bottom:1px dotted;cursor:help}apps </th> <th style="text-align:left;"> Previous best result </th> <th style="text-align:right;"> World Rank¹ </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> South Africa </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Champions (1995, 2007, 2019) </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> Namibia </td> <td style="text-align:left;"> Africa 1 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Pool stage (six times) </td> <td style="text-align:right;"> 21 </td> </tr> <tr> <td style="text-align:left;"> Asia 
</td> <td style="text-align:left;"> Japan </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (2019) </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> France </td> <td style="text-align:left;"> Hosts </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Runners-up (1987, 1999, 2011) </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> England </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (2003) </td> <td style="text-align:right;"> 8 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Ireland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (seven times) </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Italy </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Pool stage (nine times) </td> <td style="text-align:right;"> 13 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Scotland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Fourth place (1991) </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Wales </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (1987) </td> <td style="text-align:right;"> 10 </td> </tr> 
<tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Georgia </td> <td style="text-align:left;"> Europe 1 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Pool stage (five times) </td> <td style="text-align:right;"> 11 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Romania </td> <td style="text-align:left;"> Europe 2 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 19 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Portugal </td> <td style="text-align:left;"> Final Qualifier </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Pool stage (2007) </td> <td style="text-align:right;"> 16 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Australia </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1991, 1999) </td> <td style="text-align:right;"> 9 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Fiji </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1987, 2007) </td> <td style="text-align:right;"> 7 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> New Zealand </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1987, 2011, 2015) </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Samoa </td> <td style="text-align:left;"> Oceania 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1991, 1995) </td> <td 
style="text-align:right;"> 12 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Tonga </td> <td style="text-align:left;"> Asia/Pacific 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Argentina </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (2007) </td> <td style="text-align:right;"> 6 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Uruguay </td> <td style="text-align:left;"> Americas 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Pool stage (1999, 2003, 2015, 2019) </td> <td style="text-align:right;"> 17 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Chile </td> <td style="text-align:left;"> Americas 2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> Debut </td> <td style="text-align:right;"> 22 </td> </tr> </tbody> </table>

---

# Example: Wikipedia table

* A better solution uses a CSS selector: the class selector (`.`).
* Select the elements with `class="wikitable sortable"`, that is, elements carrying both the `wikitable` and the `sortable` class.
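The class selector can be tried offline on a small inline document before pointing it at Wikipedia. The following is a minimal sketch, assuming `rvest` version 1.0 or later (for `minimal_html()`); the inline HTML is a hypothetical stand-in for the real page and only reuses a few values from the tables shown in these slides.

```r
library(rvest)

# Hypothetical inline page (an assumption for illustration):
# two tables, only the first one carries both classes.
page <- minimal_html('
  <table class="wikitable sortable">
    <tr><th>Team</th><th>Rank</th></tr>
    <tr><td>Ireland</td><td>1</td></tr>
    <tr><td>South Africa</td><td>2</td></tr>
  </table>
  <table class="wikitable">
    <tr><th>Player</th><th>Team</th></tr>
    <tr><td>Owen Farrell</td><td>England</td></tr>
  </table>
')

# ".wikitable.sortable" (no space) requires BOTH classes on the same element
both <- page %>% html_nodes(".wikitable.sortable")
length(both)      # 1

# ".wikitable" alone matches every element that has that class
any_wiki <- page %>% html_nodes(".wikitable")
length(any_wiki)  # 2

# html_table() on a node set returns a list with one data frame per table
teams <- html_table(both)[[1]]
teams
```

Because `html_table()` returns a list when several nodes match, it is usually worth checking `length()` of the selection before passing the result on to `kableExtra::kable()`.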
``` r
url <- "https://en.wikipedia.org/wiki/2023_Rugby_World_Cup"
url %>%
  read_html() %>%
  html_nodes(".wikitable.sortable") %>%
  html_table() %>%
  kableExtra::kable()
# equivalently html_nodes("table.wikitable.sortable")
```

<table class="kable_wrapper"> <tbody> <tr> <td> <table> <thead> <tr> <th style="text-align:left;"> Region </th> <th style="text-align:left;"> Team </th> <th style="text-align:left;"> Qualification method </th> <th style="text-align:right;"> Previous apps </th> <th style="text-align:left;"> Previous best result </th> <th style="text-align:right;"> World Rank¹ </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> South Africa </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Champions (1995, 2007, 2019) </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> Namibia </td> <td style="text-align:left;"> Africa 1 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Pool stage (six times) </td> <td style="text-align:right;"> 21 </td> </tr> <tr> <td style="text-align:left;"> Asia </td> <td style="text-align:left;"> Japan </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (2019) </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> France </td> <td style="text-align:left;"> Hosts </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Runners-up (1987, 1999, 2011) </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> England </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td 
style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (2003) </td> <td style="text-align:right;"> 8 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Ireland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (seven times) </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Italy </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Pool stage (nine times) </td> <td style="text-align:right;"> 13 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Scotland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Fourth place (1991) </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Wales </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (1987) </td> <td style="text-align:right;"> 10 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Georgia </td> <td style="text-align:left;"> Europe 1 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Pool stage (five times) </td> <td style="text-align:right;"> 11 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Romania </td> <td style="text-align:left;"> Europe 2 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 19 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Portugal </td> <td style="text-align:left;"> 
Final Qualifier </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Pool stage (2007) </td> <td style="text-align:right;"> 16 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Australia </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1991, 1999) </td> <td style="text-align:right;"> 9 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Fiji </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1987, 2007) </td> <td style="text-align:right;"> 7 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> New Zealand </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1987, 2011, 2015) </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Samoa </td> <td style="text-align:left;"> Oceania 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1991, 1995) </td> <td style="text-align:right;"> 12 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Tonga </td> <td style="text-align:left;"> Asia/Pacific 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Argentina </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (2007) </td> <td style="text-align:right;"> 6 </td> </tr> <tr> <td style="text-align:left;"> South 
America and North America Rugby </td> <td style="text-align:left;"> Uruguay </td> <td style="text-align:left;"> Americas 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Pool stage (1999, 2003, 2015, 2019) </td> <td style="text-align:right;"> 17 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Chile </td> <td style="text-align:left;"> Americas 2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> Debut </td> <td style="text-align:right;"> 22 </td> </tr> </tbody> </table> </td> <td> <table> <thead> <tr> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 
10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th 
style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> 
Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> <th style="text-align:left;"> Top 10 points scorers </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Player </td> <td style="text-align:left;"> Team </td> <td style="text-align:left;"> Total </td> <td style="text-align:left;"> Details </td> <td style="text-align:left;"> Details </td> <td style="text-align:left;"> Details </td> <td style="text-align:left;"> Details </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Player </td> <td style="text-align:left;"> Team </td> <td style="text-align:left;"> Total </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> </td> <td style="text-align:left;"> Tries </td> <td style="text-align:left;"> Conversions </td> <td style="text-align:left;"> Penalties </td> <td style="text-align:left;"> Drop goals </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Owen Farrell </td> <td style="text-align:left;"> England </td> <td style="text-align:left;"> 75 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> 12 </td> <td style="text-align:left;"> 15 </td> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> 
NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> 
<td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Thomas Ramos </td> <td style="text-align:left;"> France </td> <td style="text-align:left;"> 74 </td> <td style="text-align:left;"> 1 </td> <td style="text-align:left;"> 21 </td> <td style="text-align:left;"> 9 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Emiliano Boffelli </td> <td style="text-align:left;"> Argentina </td> <td style="text-align:left;"> 67 </td> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> 9 </td> <td style="text-align:left;"> 13 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Johnny Sexton </td> <td style="text-align:left;"> Ireland </td> <td style="text-align:left;"> 58 </td> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> 17 </td> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> 
NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> 
<td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Richie Mo'unga </td> <td style="text-align:left;"> New Zealand </td> <td style="text-align:left;"> 56 </td> <td style="text-align:left;"> 1 </td> <td style="text-align:left;"> 18 </td> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Damian McKenzie </td> <td style="text-align:left;"> New Zealand </td> <td style="text-align:left;"> 53 </td> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> 14 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Rikiya Matsuda </td> <td style="text-align:left;"> Japan </td> <td style="text-align:left;"> 46 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> 11 </td> <td style="text-align:left;"> 8 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> 
NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> 
</tr> <tr> <td style="text-align:left;"> Ben Donaldson </td> <td style="text-align:left;"> Australia </td> <td style="text-align:left;"> 45 </td> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> 7 </td> <td style="text-align:left;"> 7 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> George Ford 
</td> <td style="text-align:left;"> England </td> <td style="text-align:left;"> 41 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> 8 </td> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> <tr> <td style="text-align:left;"> Will Jordan </td> <td style="text-align:left;"> New Zealand </td> <td 
style="text-align:left;"> 40 </td> <td style="text-align:left;"> 8 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> 0 </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td 
style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> <td style="text-align:left;"> NA </td> </tr> </tbody> </table> </td> </tr> </tbody> </table> --- # Example: Wikipedia table * An alternative solution: select `table` immediately after four `p`. 
``` r
url <- "https://en.wikipedia.org/wiki/2023_Rugby_World_Cup"
url %>%
  read_html() %>%
  html_nodes("p + p + p + p + table") %>%
  html_table() %>%
  kableExtra::kable()
```

<table class="kable_wrapper"> <tbody> <tr> <td> <table> <thead> <tr> <th style="text-align:left;"> Region </th> <th style="text-align:left;"> Team </th> <th style="text-align:left;"> Qualification method </th> <th style="text-align:right;"> Previous apps </th> <th style="text-align:left;"> Previous best result </th> <th style="text-align:right;"> World Rank¹ </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> South Africa </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> Champions (1995, 2007, 2019) </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Africa </td> <td style="text-align:left;"> Namibia </td> <td style="text-align:left;"> Africa 1 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> Pool stage (six times) </td> <td style="text-align:right;"> 21 </td> </tr> <tr> <td style="text-align:left;"> Asia </td> <td style="text-align:left;"> Japan </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (2019) </td> <td style="text-align:right;"> 14 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> France </td> <td style="text-align:left;"> Hosts </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Runners-up (1987, 1999, 2011) </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> England </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td
style="text-align:left;"> Champions (2003) </td> <td style="text-align:right;"> 8 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Ireland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Quarter-finals (seven times) </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Italy </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Pool stage (nine times) </td> <td style="text-align:right;"> 13 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Scotland </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Fourth place (1991) </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Wales </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (1987) </td> <td style="text-align:right;"> 10 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Georgia </td> <td style="text-align:left;"> Europe 1 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Pool stage (five times) </td> <td style="text-align:right;"> 11 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Romania </td> <td style="text-align:left;"> Europe 2 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 19 </td> </tr> <tr> <td style="text-align:left;"> Europe </td> <td style="text-align:left;"> Portugal </td> <td style="text-align:left;"> Final Qualifier </td> <td 
style="text-align:right;"> 1 </td> <td style="text-align:left;"> Pool stage (2007) </td> <td style="text-align:right;"> 16 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Australia </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1991, 1999) </td> <td style="text-align:right;"> 9 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Fiji </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1987, 2007) </td> <td style="text-align:right;"> 7 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> New Zealand </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Champions (1987, 2011, 2015) </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Samoa </td> <td style="text-align:left;"> Oceania 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Quarter-finals (1991, 1995) </td> <td style="text-align:right;"> 12 </td> </tr> <tr> <td style="text-align:left;"> Oceania </td> <td style="text-align:left;"> Tonga </td> <td style="text-align:left;"> Asia/Pacific 1 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:left;"> Pool stage (eight times) </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Argentina </td> <td style="text-align:left;"> Top 3 in 2019 RWC pool </td> <td style="text-align:right;"> 9 </td> <td style="text-align:left;"> Third place (2007) </td> <td style="text-align:right;"> 6 </td> </tr> <tr> <td style="text-align:left;"> South America and North America 
Rugby </td> <td style="text-align:left;"> Uruguay </td> <td style="text-align:left;"> Americas 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Pool stage (1999, 2003, 2015, 2019) </td> <td style="text-align:right;"> 17 </td> </tr> <tr> <td style="text-align:left;"> South America and North America Rugby </td> <td style="text-align:left;"> Chile </td> <td style="text-align:left;"> Americas 2 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> Debut </td> <td style="text-align:right;"> 22 </td> </tr> </tbody> </table> </td> </tr> </tbody> </table>

---

# Why could web scraping be bad?

* Scraping increases web traffic.
* People ignore and violate `robots.txt` and the Terms of Service (ToS) of websites.
* You can avoid these troubles by following three simple rules:
    1. Read the ToS of the website you want to scrape.
    2. Inspect `robots.txt` (see <https://cran.r-project.org/robots.txt> for instance).
    3. Use a reasonable frequency of requests (force your program to pause between requests).

---

# Dynamic sites (advanced)

* Sometimes, what you see in your browser is not what is returned by `read_html()`. In many cases, this is because the website requests its data dynamically, after the initial HTML has been delivered.
* A solution is to simulate a browser to cope with dynamically rendered webpages.
* _Selenium_, a project focused on automating web browsers, offers one such solution.
* You have access to Selenium with the `RSelenium` package.
* An alternative is the `chromote` package (developed by Posit), which focuses on the [Chrome DevTools Protocol](https://chromedevtools.github.io/devtools-protocol/).

---

# World bank data

<iframe src="https://data.worldbank.org/indicator/SP.ADO.TFRT" width="100%" height="400px" data-external="1"></iframe>

---

# World bank data

* Inspecting the "table".

<img src="images/worlddata.png" width="600px" style="display: block; margin: auto;" />

---

# World bank data

* Trying to fetch the data _non-dynamically_ using `class="item"`.
``` r
url <- "https://data.worldbank.org/indicator/SP.ADO.TFRT"
url %>%
  read_html() %>%
  html_nodes(".item") %>%
  html_text() # or html_nodes("div.item")
```

* Only the header is returned.

---

# World bank data

* A first dynamic solution with the `chromote` package.

``` r
library(chromote)
b <- ChromoteSession$new() # open a chromote session
url <- "https://data.worldbank.org/indicator/SP.ADO.TFRT"
b$Page$navigate(url) # navigate to the url
b$Runtime$evaluate("document.querySelector('html').outerHTML")$result$value %>%
  read_html() %>%
  html_nodes(".item") %>%
  html_text() %>%
  head()
b$close() # close the session
```

```
## [1] "CountryMost Recent YearMost Recent Value"
## [2] "Afghanistan202280"
## [3] "Albania202214"
## [4] "Algeria202212"
## [5] "American Samoa202230"
## [6] "Andorra20226"
```

---

# World bank data

Some comments on the `chromote` commands:

* `b <- ChromoteSession$new()` creates a new `ChromoteSession` object and assigns it to `b`.
* `b$Page$navigate(url)` navigates to the provided URL.
* The `Runtime$evaluate` command tells the browser to run JavaScript code.
* The JavaScript code `document.querySelector('html').outerHTML` selects the `<html>` element from the current web page's Document Object Model (DOM), and then retrieves its entire HTML content, including the element itself and everything inside it.
* Essentially, it captures the entire structure of the HTML document, from the opening `<html>` tag to the closing `</html>` tag, as a string.
* Note that the browser can be viewed using `b$view()`.
* Check the package [site](https://github.com/rstudio/chromote) for more info.

---

# World bank data

* `chromote` is for `Chrome`, `Chromium` and the like. `Selenium` is more general.
* Unfortunately, the solution using `RSelenium` is currently not working properly. But here is what a possible implementation would look like.
``` r
library(RSelenium) # provides rsDriver()
# start a Selenium server and a Firefox client
rD <- rsDriver(browser = "firefox", port = 4545L, verbose = FALSE)
remDr <- rD[["client"]]
remDr$navigate(url)
# getPageSource() returns a list; its first element is the page HTML
html_page <- remDr$getPageSource()[[1]]
html_page %>%
  read_html() %>%
  html_nodes(".item") %>%
  html_text()
```

---

# Regular Expressions in R

- Regular expressions (regex) are patterns used to match character combinations in strings. In R, they are particularly useful for extracting or replacing parts of text data.

---

# Using Regex in R

- R has built-in functions to work with regular expressions:
    - `grep()`: Search for matches of a pattern in a character vector.
    - `grepl()`: Returns a logical vector indicating whether each element contains a match.
    - `sub()`, `gsub()`: Replace the first (`sub()`) or all (`gsub()`) occurrences of a pattern in a string.
    - `regexpr()`, `gregexpr()`: Find the position and length of matches.
- `str_extract()` and `str_replace()` from the `stringr` package offer a tidy alternative.

---

# Basics of Regular Expressions

- Common symbols used in regular expressions:
    - `.`: Any single character except newline.
    - `*`: Zero or more repetitions of the preceding character.
    - `+`: One or more repetitions of the preceding character.
    - `?`: Zero or one repetition of the preceding character.
    - `[]`: A set of characters. For example, `[abc]` matches 'a', 'b', or 'c'.
    - `^`: Inside `[]`, it inverts the match. For example, `[^abc]` matches any character except 'a', 'b', or 'c'.
    - `^`: Outside `[]`, it matches the start of a string.
    - `$`: Matches the end of a string.
    - `-`: Inside `[]`, defines a range of characters. For example, `[a-z]` matches any lowercase letter.
    - `\\`: Escape character (a single backslash in the regex, written as `\\` inside an R string).

---

# Useful pairs of characters

- `\\d`: Any digit.
- `\\D`: Any non-digit.
- `\\w`: Any word character (alphanumeric + underscore).
- `\\W`: Any non-word character.
- `\\s`: Any whitespace character.
- `\\S`: Any non-whitespace character.

Quantifiers:

- `{n}`: Exactly n repetitions.
- `{n,}`: At least n repetitions.
- `{n,m}`: Between n and m repetitions.
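---

# Example: Shorthand classes and quantifiers

A minimal sketch of how the shorthand classes and quantifiers above can be combined, using base R only. The sample strings in `x` are hypothetical and were invented purely for illustration.

``` r
# Hypothetical sample strings (not taken from any real data source)
x <- c("Week 5: 17 Oct 2024", "Week 12: 5 Dec 2024", "no date here")

# \\d{4}: exactly four consecutive digits (here, a year)
has_year <- grepl("\\d{4}", x)

# regexpr() locates the first match in each string;
# regmatches() extracts it and drops strings without a match
years <- regmatches(x, regexpr("\\d{4}", x))

# \\d{1,2}: one or two digits, \\s+: one or more whitespace characters,
# [A-Za-z]{3}: exactly three letters (day followed by abbreviated month)
day_month <- regmatches(x, regexpr("\\d{1,2}\\s+[A-Za-z]{3}", x))

# \\s+ with gsub(): collapse every run of whitespace into a single space
squeezed <- gsub("\\s+", " ", "too    many   spaces")

print(has_year)   # TRUE TRUE FALSE
print(years)      # "2024" "2024"
print(day_month)  # "17 Oct" "5 Dec"
print(squeezed)   # "too many spaces"
```

Note that `regexpr()` only reports the first match per string; `gregexpr()` would be the base R choice when all matches are needed.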
---

# Example: Extracting Data with Regex

``` r
library(stringr)
text <- "John's email is john.doe@example.com and Jane's email is jane_doe123@example.org"
# Extract all email addresses
# ChatGPT solution:
# emails <- str_extract_all(text, "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}")
emails <- str_extract_all(text, "\\S+@\\S+")
print(emails)
```

```
## [[1]]
## [1] "john.doe@example.com"    "jane_doe123@example.org"
```

---

# Example: Cleaning Text with Regex

- Use `gsub()` to clean text data by removing unwanted characters.

``` r
# Replace every non-word character (anything but letters, digits and underscore) with a space
# ChatGPT solution:
# clean_text <- gsub("[^a-zA-Z0-9\\s]", " ", "Hello, World! Welcome to R programming.")
clean_text <- gsub("\\W", " ", "Hello, World! Welcome to R programming.")
print(clean_text)
```

```
## [1] "Hello  World  Welcome to R programming "
```

---
class: sydney-blue, center, middle

# Questions?

.pull-down[ <a href="https://ptds.samorso.ch/"> .white[<svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M369.9 97.9L286 14C277 5 264.8-.1 252.1-.1H48C21.5 0 0 21.5 0 48v416c0 26.5 21.5 48 48 48h288c26.5 0 48-21.5 48-48V131.9c0-12.7-5.1-25-14.1-34zM332.1 128H256V51.9l76.1 76.1zM48 464V48h160v104c0 13.3 10.7 24 24 24h104v288H48z"></path></svg> website] </a> <a href="https://github.com/ptds2023/"> .white[<svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7
1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg> GitHub] </a> ] <!-- --- --> <!-- # Exercises --> <!-- 1. Play with [CSS Diner](https://flukeout.github.io/) to get familiar with CSS Selectors. --> <!-- 2. Follow this [workflow](https://smac-group.github.io/ds/section-web-scraping.html#section-workflow). It uses the _SelectorGadget_. Propose an alternative solution using CSS selectors. You will probably need to use the developer tools of your browser. --> <!-- 3. Repeat exercise 2. using `RSelenium` or `chromote`. --> <!-- 4. Extract the information from the World bank data example using regular expressions. --> --- # To go further * More details and examples in the book [An Introduction to Statistical Programming Methods with R](https://smac-group.github.io/ds/section-web-scraping.html) * <https://github.com/yusuzech/r-web-scraping-cheat-sheet/> * Want to build your own R API wrapper? 
Have a look at <https://colinfay.me/build-api-wrapper-package-r/> and <https://httr2.r-lib.org/articles/wrapping-apis.html>
* [DataCamp](https://www.datacamp.com/courses/web-scraping-in-r) class on webscraping with R
* [Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining](https://www.wiley.com/en-us/Automated+Data+Collection+with+R%3A+A+Practical+Guide+to+Web+Scraping+and+Text+Mining-p-9781118834817)
* See also the chapters on [webscraping](https://r4ds.hadley.nz/webscraping) and [regular expressions](https://r4ds.hadley.nz/regexps) of R for Data Science.
* W3Schools for [HTML](https://www.w3schools.com/html/default.asp) and [CSS](https://www.w3schools.com/css/default.asp).

---

# Resources for Learning Regex

- [Regular Expressions in R](https://r4ds.hadley.nz/regexps) - Chapter from R for Data Science.
- [regex101](https://regex101.com/) - An interactive regex tester for experimenting with patterns.
- [stringr package](https://stringr.tidyverse.org/) - Provides functions to simplify regex usage in R.