[C# / VB.NET] Example of using the HtmlAgilityPack library for crawling
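Imports System.Linq
Imports HtmlAgilityPack
' T holds the page HTML on entry and is reused below as a scratch variable for each extracted value.
Dim doc As New HtmlDocument, a As HtmlNode, img As HtmlNode, hNode As HtmlNode
doc.LoadHtml(T)
' SelectNodes returns Nothing (not an empty collection) when nothing matches, so query once and check.
Dim items As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//*[@id=""productList""]/li")
If items IsNot Nothing Then
    For Each li As HtmlNode In items
        a = li.SelectSingleNode("a")
        If a Is Nothing Then Continue For
        ' Inner HTML of dl/dd/div[2], with line feeds replaced by spaces
        T = a.SelectSingleNode("dl/dd/div[2]").InnerHtml.Replace(vbLf, Space(1)).Trim
        ' Item ID
        T = a.Attributes("data-item-id").Value
        ' "True" when the data-is-rocket attribute equals "true"
        T = (a.Attributes("data-is-rocket").Value = "true").ToString()
        ' Absolute product URL; the raw href still contains &amp;, so turn it back into &
        T = "https://www.coupang.com" & a.Attributes("href").Value.Replace("&amp;", "&")
        img = a.SelectSingleNode("dl/dt/img")
        ' Image URL: use data-img-src when the attribute is present, otherwise src
        If img.Attributes("data-img-src") IsNot Nothing Then
            T = "https:" & img.Attributes("data-img-src").Value
        Else
            T = "https:" & img.Attributes("src").Value
        End If
        ' The h3 in the search option form is page-level, so query it from the document root
        Dim heading As HtmlNode = doc.DocumentNode.SelectSingleNode("//*[@id=""searchOptionForm""]/div/div/div[1]/div/div[1]/h3")
        If heading IsNot Nothing Then
            ' Keep only the text before the first "("
            T = heading.InnerText.Replace(vbTab, Space(1)).Replace(vbLf, Space(1)).Split("("c).First.Trim
        End If
        ' Text of the em/strong element under dl/dd/div[3]
        T = a.SelectSingleNode("dl/dd/div[3]/div[1]/div[1]/em/strong").InnerText
        ' span inside span[1] whose class contains "discount-percentage" (Nothing if absent)
        hNode = GetClassNode(a.SelectSingleNode("dl/dd/div[3]/div[1]/div[1]/span[1]"), "span", "discount-percentage")
    Next
End If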
' Returns the first descendant of Node with the given tag name whose class attribute
' contains className (substring match), or Nothing if Node is Nothing or no node matches.
Private Function GetClassNode(ByVal Node As HtmlNode, ByVal tagName As String, ByVal className As String) As HtmlNode
    If Node Is Nothing OrElse className Is Nothing Then Return Nothing
    Return Node.Descendants(tagName).FirstOrDefault(Function(k) k.Attributes.Contains("class") AndAlso k.Attributes("class").Value.Contains(className))
End Function