
I need to scrape information from the Apple App Store. I have a hash map, hashmap_genre_link, that maps each genre to its URL ({'Games': 'https://itunes.apple.com/us/genre/ios-games/id6014?mt=8', ...}). For each key, I want to build another hash map that maps iOS app names (the link text) to app URLs, for example games_apps: {'Pokemon Go': 'https://itunes.apple.com/us/app/pokémon-go/id1094591345?mt=8', ...}.

My code is as follows:

from bs4 import BeautifulSoup
from requests import get
links = []
ios_categories_links=[]
hashmap_genre_link ={}
url = "https://itunes.apple.com/US/genre/ios/id36"
response = get(url)
html_soup = BeautifulSoup(response.text,"html.parser")
categories_class = html_soup.find_all('div',class_="grid3-column")
# cat = categories_class.text
href = html_soup.find_all('a', href=True)
for j in href:
    # print(j['href'])
    links.append(j['href'])
#
# Hashmap initialisation: hashmap_genre_link = {"games": "https://link_for_games_page", ...}
for i in links:
    if "https://itunes.apple.com/us/genre/ios" in i:
        genre = i.split("/")[5][4:] #We get the genre, without 'ios-'
        hashmap_genre_link[genre] = i
        ios_categories_links.append(i)
#print(hashmap_genre_link)

for the_key, the_value in hashmap_genre_link.items():
    #print(the_key, 'corresponds to', the_value)
    print("=======================")
    print(the_key)
    response_genre_link = get(the_value)
    html_soup_genre_link = BeautifulSoup(response_genre_link.text,"html.parser")
    genre_popular_apps_class = html_soup_genre_link.find_all('div',class_="grid3-column")
    for x in genre_popular_apps_class:
        print(x['href'])

Part of the output:

=======================
games-family
<div class="grid3-column" id="selectedcontent">
<div class="column first">
<ul>
<li><a href="https://itunes.apple.com/us/app/trivia-crack/id651510680?mt=8">Trivia Crack</a> </li>
<li><a href="https://itunes.apple.com/us/app/minion-rush/id596402997?mt=8">Minion Rush</a> </li>
<li><a href="https://itunes.apple.com/us/app/draw-something-classic/id488628250?mt=8">Draw Something Classic</a> </li>

How do I get the href value of each tag? (I know I can use .text to get the text.)


  • Answer #1

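    You have the right idea with ['href'] to get the attribute value, but you need to go one level deeper. Each x is a <div> element that contains all the <a> tags; it is not an <a> tag itself, so it has no href of its own. You need an additional x.find_all('a'), then iterate over the result and print the href attribute of each <a> tag.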

    So this is what I added:

    for x in genre_popular_apps_class:
        alpha = x.find_all('a')
        for beta in alpha:
            print(beta['href'])
    
    

    Full code:

    from bs4 import BeautifulSoup
    from requests import get
    links = []
    ios_categories_links=[]
    hashmap_genre_link ={}
    url = "https://itunes.apple.com/US/genre/ios/id36"
    response = get(url)
    html_soup = BeautifulSoup(response.text,"html.parser")
    categories_class = html_soup.find_all('div',class_="grid3-column")
    # cat = categories_class.text
    href = html_soup.find_all('a', href=True)
    for j in href:
        # print(j['href'])
        links.append(j['href'])
    #
    # Hashmap initialisation: hashmap_genre_link = {"games": "https://link_for_games_page", ...}
    for i in links:
        if "https://itunes.apple.com/us/genre/ios" in i:
            genre = i.split("/")[5][4:] #We get the genre, without 'ios-'
            hashmap_genre_link[genre] = i
            ios_categories_links.append(i)
    #print(hashmap_genre_link)
    results_dict = {}
    for the_key, the_value in hashmap_genre_link.items():
        #print(the_key, 'corresponds to', the_value)
        print("=======================")
        print(the_key)
        response_genre_link = get(the_value)
        html_soup_genre_link = BeautifulSoup(response_genre_link.text,"html.parser")
        genre_popular_apps_class = html_soup_genre_link.find_all('div',class_="grid3-column")
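        app_links = []  # collect the links from every matching div for this genre
        for x in genre_popular_apps_class:
            for beta in x.find_all('a', href=True):
                print(beta['href'])
                app_links.append(beta['href'])
        results_dict[the_key] = app_links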
    
    

    Output:

    ....
    =======================
    games-racing
    https://itunes.apple.com/us/app/bike-race-free-style-games/id510461758?mt=8
    https://itunes.apple.com/us/app/hill-climb-racing/id564540143?mt=8
    https://itunes.apple.com/us/app/csr-racing/id469369175?mt=8
    https://itunes.apple.com/us/app/real-racing-3/id556164008?mt=8
    https://itunes.apple.com/us/app/asphalt-8-airborne/id610391947?mt=8
    https://itunes.apple.com/us/app/csr-racing-2/id887947640?mt=8
    https://itunes.apple.com/us/app/smashy-road-wanted/id1020119327?mt=8
    https://itunes.apple.com/us/app/happy-wheels/id648668184?mt=8
    https://itunes.apple.com/us/app/angry-birds-go/id642821482?mt=8
    https://itunes.apple.com/us/app/need-for-speed-no-limits/id883393043?mt=8
    ...
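
The question asks for a map from app name to app URL, while the code above collects only the URLs. A minimal sketch of that last step follows, assuming the genre pages still use the grid3-column markup shown in the question. The helper name extract_apps and the SAMPLE_HTML constant are hypothetical, and the sample markup is the truncated output from the question with closing tags added so it parses cleanly. It works on an inline string so it can be tried without a network request.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question's output (only three apps);
# the closing tags were added here for illustration.
SAMPLE_HTML = """
<div class="grid3-column" id="selectedcontent">
<div class="column first">
<ul>
<li><a href="https://itunes.apple.com/us/app/trivia-crack/id651510680?mt=8">Trivia Crack</a> </li>
<li><a href="https://itunes.apple.com/us/app/minion-rush/id596402997?mt=8">Minion Rush</a> </li>
<li><a href="https://itunes.apple.com/us/app/draw-something-classic/id488628250?mt=8">Draw Something Classic</a> </li>
</ul>
</div>
</div>
"""


def extract_apps(html):
    """Return {app name: app URL} for every link inside a grid3-column div."""
    soup = BeautifulSoup(html, "html.parser")
    apps = {}
    for column in soup.find_all("div", class_="grid3-column"):
        # href=True skips any <a> tag that has no href attribute.
        for anchor in column.find_all("a", href=True):
            apps[anchor.get_text(strip=True)] = anchor["href"]
    return apps


games_family_apps = extract_apps(SAMPLE_HTML)
for name, link in games_family_apps.items():
    print(name, "->", link)
```

In the full script, the same helper could be called once per genre, for instance results_dict[the_key] = extract_apps(response_genre_link.text), which yields one name-to-URL dictionary per genre key.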
    
    
