Voting anomalies in the vote on amendments to the Russian Constitution. Part 2

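Link to Part 1

The main goal of Part 2 is to use specific examples to examine in detail the large-scale "drawing" (fabrication) of voting results.

As in Part 1, all calculations, visualizations, and data parsing are available in Google Colab, via this Google Colab link.

Why is large-scale analysis of election data important?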

95 . .

-, 14 . .

-, ( ) 78 . - 8.8 . -.

, RUElectionData "". . .

— , . . , IT - .

, “” () . , .

“” , . . , .

:

, . 327 19 , 25 . 19 . .
“” . , , , 2014 2014 .
2000-2018 .

. , . . .

10 46 12 91.0 10 90.0 ( ):

43 43 79%, 78%, 75%, 74.8% 74% ( ):

60 , 72 21 '' 79.0%, 5 78.0% 4 78.9% ( ):

76 № 2 85 14 '' 75.0% 8 74.9% ( ):

22 , ‘’ 80%, 79%, 78%, 77% 75% ( ):

, , . , , SOS .

, . () ( ):

, -, '' '' ( ):

, 71.1% 85.6%. (duplicates) - ds. min_n_duplicates, . total_duplicates . get_duplicates:

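import pandas as pd

def get_duplicates(dq, col_name='yes_pct', min_n_duplicates=3):
    # Round percentages to one decimal place and count rows per rounded value.
    rounded = pd.DataFrame(dq[col_name].values.round(1), columns=[col_name])
    ds = rounded.groupby(col_name).size().to_frame('size')
    # Keep values shared by at least min_n_duplicates rows, most frequent first.
    ds = ds[ds['size'] >= min_n_duplicates].sort_values(by='size', ascending=False)
    total_duplicates = ds['size'].sum()
    ds = ds.reset_index()
    return total_duplicates, ds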
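To illustrate what a function like get_duplicates returns, here is a small self-contained sketch on made-up percentages. The toy numbers and the helper name count_rounded_duplicates are hypothetical, and value_counts is used as a compact stand-in for the groupby above, not as the notebook's own code.

```python
import pandas as pd

def count_rounded_duplicates(df, col_name="yes_pct", min_n_duplicates=3):
    # Hypothetical helper: round to one decimal and count repeated values.
    counts = df[col_name].round(1).value_counts()
    # Keep only values that occur at least min_n_duplicates times.
    counts = counts[counts >= min_n_duplicates]
    ds = counts.rename_axis(col_name).reset_index(name="size")
    return int(counts.sum()), ds

# Made-up data: three results round to 75.0, three to 80.1, two to 74.9.
toy = pd.DataFrame(
    {"yes_pct": [75.0, 75.02, 74.98, 74.9, 74.9, 80.12, 80.08, 80.1, 61.3]}
)
total_duplicates, ds = count_rounded_duplicates(toy)
print(total_duplicates)
print(ds)
```

Only the values 75.0 and 80.1 pass the default threshold of three, so the pair at 74.9 is not counted.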

ds 5 №5 (size () ):

0.1%, . - dr : ( ) total_duplicates, pct_duplicates prob_duplicates.

, , .

, 5%. , 9%, ‘’ 7%.

n_levels=50 (10 ). n_stations=40 , n_identicals=10 , :

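from scipy import special

def get_p(n_identicals=10, n_stations=40, n_levels=50):
    # Binomial probability that exactly n_identicals of n_stations
    # fall on one given level out of n_levels equally likely levels.
    bin_coeff = special.binom(n_stations, n_identicals)
    prob = (bin_coeff * (1 / n_levels) ** n_identicals
            * ((n_levels - 1) / n_levels) ** (n_stations - n_identicals))
    return prob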
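As a sanity check, the expression in get_p has the form of a binomial probability: exactly n_identicals out of n_stations independent results land on one particular level, with each of n_levels levels assumed equally likely. The sketch below restates that formula and compares it with scipy.stats.binom.pmf. The helper name get_p_check and the equal-likelihood assumption are illustrative, not taken from the notebook.

```python
from scipy import special, stats

def get_p_check(n_identicals=10, n_stations=40, n_levels=50):
    # Assumption: every level is hit with the same probability 1/n_levels.
    p = 1.0 / n_levels
    # Manual binomial formula, same structure as get_p.
    manual = (special.binom(n_stations, n_identicals)
              * p ** n_identicals
              * (1.0 - p) ** (n_stations - n_identicals))
    # Library value for the same event.
    library = stats.binom.pmf(n_identicals, n_stations, p)
    return manual, library

manual, library = get_p_check()
print(manual, library)
```

The two numbers should agree up to floating-point error, which suggests the hand-written formula is a plain binomial term.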

, . c get_prob_duplicates:


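import numpy as np
from scipy.special import factorial

def multinomial_coeff(c):
    # Multinomial coefficient (sum c)! / (c_1! * c_2! * ...).
    return factorial(c.sum()) / factorial(c).prod()

def get_prob_duplicates(duplicates=(10, 5), n_stations=40, n_levels=50):
    n_duplicates = len(duplicates)
    sum_duplicates = sum(duplicates)
    # Group sizes: one group per duplicated level, plus all remaining stations.
    coeffs = np.array(list(duplicates) + [n_stations - sum_duplicates])
    mc = multinomial_coeff(coeffs)
    # Floats keep the large powers within floating-point range.
    prob = (mc * float(n_levels - n_duplicates) ** (n_stations - sum_duplicates)
            / float(n_levels) ** n_stations)
    return prob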

. , .

, . , 0.1%.

, n_levels n_stations, , .

“” .

. .

(numpy.round(1)), 0 9. plot_first_digit . :

‘’ x-np.floor(x) :

yes_pct.apply(lambda x: x-np.floor(x)).hist(bins=25,grid=True)

.

x-np.round(x) :

yes_pct.apply(lambda x: x-np.round(x)).hist(bins=25,grid=True)

, . , ( -!). .

Take a target of 82.0% with N=1021: 0.82*1021=837.2, which rounds to 837, and 837/1021=81.98%.

, ±1/(2N) . , . . .
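The rounding argument can be sketched in a few lines: pick a target percentage and a total N, take the nearest whole number of votes, and compare the achievable percentage with the target. The function name nearest_achievable_pct is hypothetical. The bound it returns is the 1/(2N) limit expressed in percentage points.

```python
def nearest_achievable_pct(target_pct, n_total):
    # Nearest whole number of votes to the target share.
    votes = round(target_pct / 100 * n_total)
    # Percentage actually obtained with a whole number of votes.
    achieved_pct = 100 * votes / n_total
    # Largest possible gap between target and achieved share: 1/(2N), in percent.
    max_error_pct = 100 / (2 * n_total)
    return votes, achieved_pct, max_error_pct

votes, achieved, bound = nearest_achievable_pct(82.0, 1021)
print(votes, round(achieved, 2), round(bound, 3))
```

The achieved share can miss the target only by the rounding of a single vote, so the gap shrinks as N grows.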

, . ‘’ 100%-’’. . , .

,

. . . .

, , 2014 , 2014 , 2014 , 16 2014. . .

“”.

, , , 3 2014 . wikipedia :

-: 10 319 723 (88,7%)

-: 500 279 (4,3%)

: 372 301 (3,2%)

: 15 845 575

: 11 634 412

: 442 108 (3,8%)

:

10 , 0,00001%. , 88.70000% ~ 1/10000.

10 319 723 / 11 634 412 * 100 = 88.70000%

372 301 / 11 634 412 * 100 = 3.20000%

442 108 / 11 634 412 * 100 = 3.80000%

(1/10000)^3 = 10^(-12).
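These ratios are easy to re-check. The sketch below recomputes each share from the counts quoted above and prints it to five decimal places, together with its distance from the nearest tenth of a percent. The variable names are illustrative.

```python
# Counts copied from the figures quoted above.
total = 11_634_412
counts = [10_319_723, 372_301, 442_108]

# Share of each count in the total, in percent.
shares = [100 * c / total for c in counts]
# Distance of each share from the nearest multiple of 0.1%.
offsets = [abs(s - round(s, 1)) for s in shares]

for c, s, d in zip(counts, shares, offsets):
    print(f"{c:>10,d} / {total:,d} = {s:.5f}%  (off by {d:.7f})")
```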

, 16 2014 (wikipedia).

: 306 258

: 274 101

« »: 262 041

274 101 306 258 — 274101/306258 = 89.500%, «» 262041 274 101 95.600%.

, , .

~ 300 000, 0,001%. , 89,500% 95,600% (1/100)^2 = 0.0001, 10 0.001 .
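The 1/100 figure can be checked by brute force. The sketch below assumes, purely for illustration, that every vote count from 0 to the total is equally likely, and counts how often the resulting percentage, rounded to three decimals, is a whole multiple of 0.1%. It also recomputes the two ratios quoted above. The function name fraction_on_round_tenths is hypothetical.

```python
import numpy as np

def fraction_on_round_tenths(total):
    # All possible numerators 0..total (uniform assumption, illustration only).
    k = np.arange(total + 1)
    # Percentage in units of 0.001%, rounded to the nearest integer.
    thousandths = np.round(100000.0 * k / total).astype(np.int64)
    # A multiple of 0.1% is a multiple of 100 in these units.
    return float(np.mean(thousandths % 100 == 0))

frac = fraction_on_round_tenths(306_258)
ratio_1 = 100 * 274_101 / 306_258
ratio_2 = 100 * 262_041 / 274_101
print(f"{ratio_1:.3f}% {ratio_2:.3f}%")
print(frac, frac ** 2)
```

Under that uniform assumption the fraction comes out close to 1/100, so two independent hits have a probability of about 0.0001.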

-.

. ( ) ‘’ ( “”) . - . . — .

, 2000 2020 2008 .

. , , . , .

. , . , () . , .

: , .

, . -, . , .

- , , . .

, , . , .

1: 50% 371 . (c )

2: 260 33 , - ( ) . 259 .

№ 259 ( 32%, "" 50.79%, "" 48.37%),№ 260 (33, 44.48, 55.11 )

(64.84%) 260 (33.5%). =64.8%-33.5=31%.

, , 259 260 ( ) , .

3: , 1108 850/1219=70%.

, 482/1219=40%. , =70-40=30%. , 7.36, « . . ».

, ( ) . , 30% 50% - - /, . , , .

, 80 90 55 . “”:"Our power comes from the perception of our power".

IT , - . ", ".

: « , , , » III; " , , ; , — » ; « , — » .

, , . , - ( ), .

?

, .

, c № 2236 . -. , 99% – «».

. , . .

, , , . .

. , 2020 , .

, , .

, . , “ ”. .

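Open, accessible data and reproducible analysis matter. In publishing both parts of this article, I pursued exactly that goal. Readers who disagree with the conclusions, or who do not trust the mathematical models used to interpret the data, can use the data and code provided to build their own models.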
