With the development of communication technology and the general trend toward interconnection, demand for high-quality information and communication services is growing day by day. Dense deployment of base stations has therefore become the trend in building the new generation of communication networks. This paper addresses urban and rural scenarios that traditional transmission cannot reach. With the aim of keeping the operator’s site cost low while maintaining the quality of signal transmission, it combines the strengths of traditional planning concepts and artificial intelligence algorithms and integrates the ideas of “dividing the whole into parts”, “step-by-step optimization”, “local optimum” and “reverse derivation from results” to build a mathematical model of the wireless return (backhaul) network topology planning problem.

The purpose of this paper is to reduce the cost and the return path loss. For the cost problem, a local optimal model is constructed. The sites are first divided into K regions by the K-Means algorithm, and each region is further split into n groups, again by the K-Means algorithm. Each group has one and only one host station: the butterfly station nearest to the centroid of the group serves as the host, and the remaining sites serve as sub-stations. Whether the number of sub-stations satisfies the limit for the group's station type is then checked; if it does not, the K value and the adjustable radius are changed in turn until the limit is satisfied, and the best solution is calculated. For the return path loss problem, when only loss is considered, the loss is obtained by varying the first-hop and second-hop distances in the model. Comparing the results gives the best scheme when only loss is considered.

With the rapid development of mobile communication networks, mobile terminal devices and applications affect every aspect of people’s lives. However, current network construction faces several problems. Accelerating urban construction makes the urban environment more and more complex, which creates many wireless signal black spots and weak-coverage areas in densely populated urban areas. Some urban residents misunderstand base station construction and believe that base stations in residential communities are seriously harmful, which makes deployment significantly more difficult, and station demolitions are increasing. Because property coordination is difficult in many communities, the completion rate of last-kilometer transmission fiber deployment is low.


The locations of the candidate sites are known, and there are 1000 sites. Only the relative locations and topological relationships between sites are considered, that is, only the distance between a host station and its sub-stations. The integrated costs of the station types, namely the host station and the sub-station, and the cost of satellite equipment are also known.

Total cost = host station cost * number of host stations + substation cost * number of substations + satellite cost * number of satellites

Average cost = total cost/number of sites in the region

The topological relationship between sites meets the following conditions:

The distance between the host station and the first-level sub-station is no more than 20km, and the distance between sub-stations is no more than 10km.

Sites are of two types: RuralStar (one sector) and butterfly station (two sectors).

If a site is a host station, the maximum number of first-level sub-stations in each sector is 4, and the total number of sub-stations does not exceed 6.

The coverage direction of the butterfly station sectors is not considered.

The limit of microwave communication distance between host stations is 50km.

Wireless return connection is adopted between host station and sub-station and between sub-stations;

Each sub-station can only have two wireless back links at most, that is, the upstream and downstream links are unique;

The relation diagram between host station and substation is similar to the tree diagram.

There is only one path between any sub-station and its host station, and the path contains no more than 3 hops.

There is a one-to-many relationship between a satellite and host stations: a group of no more than 8 host stations can share one satellite.

Wireless return propagation is affected by NLOS scenarios; the free-space propagation model is adopted to simplify the calculation.

The model estimates the path loss between sites with the free-space formula:

$$PL = 32.45 + 20\log_{10} D + 20\log_{10} F$$

where PL is the path loss in dB, D is the distance between the two stations in km, and F is the transmission frequency in MHz, taken as a constant 900 MHz by default.

Average system loss = sum of all wireless return connection losses/number of wireless return connections
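The path loss estimate and the average system loss defined above can be sketched in Python. This is a minimal illustration, not the paper's program: the constant 32.45 is the standard free-space value for D in km and F in MHz and is an assumption here, and the function names are hypothetical.

```python
import math

# Assumed free-space constant for distance in km and frequency in MHz.
FSPL_CONSTANT_DB = 32.45


def path_loss_db(distance_km, freq_mhz=900.0):
    """Free-space path loss PL (dB) of one wireless return link."""
    if distance_km <= 0 or freq_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    return (FSPL_CONSTANT_DB
            + 20 * math.log10(distance_km)
            + 20 * math.log10(freq_mhz))


def average_system_loss_db(link_distances_km, freq_mhz=900.0):
    """Sum of all wireless return link losses divided by the number of links."""
    if not link_distances_km:
        raise ValueError("at least one wireless return link is required")
    losses = [path_loss_db(d, freq_mhz) for d in link_distances_km]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # One 20 km first hop followed by two 10 km hops (illustrative values).
    print(round(path_loss_db(20.0), 2))
    print(round(average_system_loss_db([20.0, 10.0, 10.0]), 2))
```

Because the loss grows with the logarithm of distance, halving a hop length lowers that link's loss by about 6 dB regardless of the frequency.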

Schematic diagram of connections between sites

The RRN (eRelay Remote Node) wireless transmission device is assumed to serve as the sub-station.

The DeNB is assumed to serve as the host station and can be divided into 1 to 3 host cells covering different directions.

The effects of terrain blocking and of ordinary mobile phone access blocking on return transmission quality are not considered.

ReBTS interference is not considered.

Interference from adjacent base stations is not considered.

The first hop of a cascade between base stations is assumed to be no longer than 20 km, and each subsequent hop no longer than 10 km.

The sector coverage direction of the butterfly station is not taken into account, and its maximum number of sub-stations is 12.

The maximum number of sub-stations of a RuralStar station is assumed to be 4.

Host stations are assumed to be connected by microwave, with a maximum communication distance of 50 km.

Host stations and sub-stations, and sub-stations with each other, are assumed to be connected by wireless return transmission.

A sub-station is assumed to belong to only one host station, with only one path to it, and the path contains no more than 3 hops.

Each sub-station is assumed to have no more than 2 wireless return connections.

Each host station is assumed to have exactly one satellite responsible for its return transmission; host stations connected into one contiguous group can share one satellite, and each satellite can carry the return data of at most 8 host stations.

The total number of host stations is assumed to be unlimited.

The maintenance costs of host stations, sub-stations and satellites are not taken into account.

Other factors being neglected, the spherical model is transformed into a plane model.

The path loss is estimated by the free-space model, without considering the influence of NLOS.
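Before the spherical model is flattened into a plane, the distance S between two sites on a sphere of radius R can be computed from their longitudes and latitudes. The sketch below uses the haversine formula, which is one common choice and an assumption here; the paper does not state which spherical distance formula it uses, and the function name and the radius value of 6371 km are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # R: mean radius of the earth (assumed value)


def spherical_distance_km(lon1_deg, lat1_deg, lon2_deg, lat2_deg,
                          radius_km=EARTH_RADIUS_KM):
    """Great-circle distance S between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(
        math.radians, (lon1_deg, lat1_deg, lon2_deg, lat2_deg))
    # Haversine term; clamp to 1.0 to guard against rounding error.
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(h)))


if __name__ == "__main__":
    # One degree of longitude along the equator.
    print(round(spherical_distance_km(0.0, 0.0, 1.0, 0.0), 2))
```

Over link lengths of at most 50 km the difference between this distance and a planar approximation is small, which is what makes the plane model a reasonable simplification.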

NOTATION TABLE

Symbol | Meaning |
---|---|
LOS | Line-of-sight transmission capability |
ROI | Return on investment |
NLOS | Non-line-of-sight transmission capability |
RRN | Wireless return device of the wireless return scheme |
RN | Relay station |
UE | Ordinary mobile phone |
PL | Path loss |
APL | Average path loss |
D | Distance between sites |
F | Transmission frequency |
R | Radius of the earth |
S | Spherical distance |
αi | Longitude of point i |
βi | Latitude of point i |
i | Base station index |
Xi | Host station |
Yi | Sub-station |
Zi | Satellite |
C | Overall cost |
FD | First-hop distance |
ND | Distance of each subsequent hop |
FXi | Butterfly host station |
RXi | RuralStar host station |
WL | Microwave connection between host stations |
WBL | Wireless return connection between host station and sub-station and between sub-stations |
JS | Number of hops from a sub-station to its host station |
Ceil | Round-up (ceiling) function |

In combination with table

COSTS OF THE VARIOUS MODES OF TRANSMISSION (WUSD)

Item | Cost (WUSD) |
---|---|
Host station | 10 |
Sub-station | 5 |
Satellite | 50 |

For path loss, only the return links of the sub-stations are considered; the links between host stations only need to satisfy the distance limit and contribute no loss. A smaller path loss can therefore be achieved by increasing the number of host stations: the path loss is lowest when the number of host stations is largest, but the cost of satellite transmission increases accordingly. If the effective distance of wireless return transmission is restricted, that is, the distance between a sub-station and its host station and between sub-stations is restricted, a low path loss can in theory be obtained. Using the idea of “reverse derivation from results”, the model is modified, the modified station layout is derived, and the lowest-cost scheme, which is the optimal solution of the problem, is finally selected.

Total cost = number of host stations * host station cost + number of sub-stations * sub-station cost + number of satellites * satellite cost. With unit costs of 10, 5 and 50 WUSD, and with the number of satellites equal to Ceil(number of host stations/8), this gives

$$C = 10X + 5Y + 50Z, \qquad Z = \mathrm{Ceil}(X/8)$$

Where: C represents the total cost under the topology;

X represents the number of host stations;

Y represents the number of sub-stations;

Z is the number of satellites, and Ceil() rounds up to the nearest integer.
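The cost formula can be checked with a few lines of Python. This is a minimal sketch using the unit costs from the cost table and the limit of 8 host stations per satellite; the function names are hypothetical.

```python
import math

HOST_COST = 10            # WUSD per host station
SUBSTATION_COST = 5       # WUSD per sub-station
SATELLITE_COST = 50       # WUSD per satellite
HOSTS_PER_SATELLITE = 8   # one satellite serves at most 8 host stations


def satellites_needed(num_hosts):
    """Z = Ceil(X / 8)."""
    return math.ceil(num_hosts / HOSTS_PER_SATELLITE)


def total_cost(num_hosts, num_substations):
    """C = host cost * X + sub-station cost * Y + satellite cost * Z."""
    return (HOST_COST * num_hosts
            + SUBSTATION_COST * num_substations
            + SATELLITE_COST * satellites_needed(num_hosts))


def average_cost(num_hosts, num_substations, num_sites):
    """Total cost divided by the number of sites in the region."""
    return total_cost(num_hosts, num_substations) / num_sites


if __name__ == "__main__":
    print(satellites_needed(222))        # 28
    print(total_cost(222, 778))          # 7510
    print(average_cost(222, 778, 1000))  # 7.51
```

The example values are the station counts reported for k=20 with a 20 km radius; the sketch reproduces the reported 28 satellites and total cost of 7510 WUSD.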

The first hop is at most 20 km long, and each subsequent hop is at most 10 km long.

Sites are of two station types, RuralStar and butterfly station: a RuralStar contains 1 sector and a butterfly station contains 2 sectors. If a site is a host station, the maximum number of first-level sub-stations in each sector is 4, and the maximum total number of sub-stations is 6. To simplify the problem, the sector coverage direction of the butterfly station is not considered for the moment.

Microwave connection is adopted between the host stations, and the maximum communication distance is 50km.

Wireless return connection is adopted between host station and sub-station and between sub-stations.

Each sub-station can only have two wireless back links at most.

Any sub-station can only belong to one host station, and there is only one path to the host station, and the number of hops contained in the path is less than or equal to 3.

Every host station has one and only one satellite responsible for its return transmission. Host stations connected into one contiguous group can share the same satellite, but a satellite can carry the return data of at most eight host stations.

Within a contiguous group of host stations, there is no upper limit on the total number of host stations. The constraint conditions are as follows:

Based on the above analysis, the model that gives the highest priority to lowering the overall cost is established as follows:
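As a compact summary, the cost formula and the distance and hop constraints listed above can be collected into one optimization problem. This is a sketch assembled from the stated conditions using the symbols of the notation table, under the assumption that host stations can always be grouped so that Z = Ceil(X/8); it is not necessarily the exact form of the original formulation.

$$
\begin{aligned}
\min\quad & C = 10X + 5Y + 50Z \\
\text{s.t.}\quad & Z = \mathrm{Ceil}(X/8) \\
& FD \le 20\ \text{km}, \qquad ND \le 10\ \text{km} \\
& WL \le 50\ \text{km} \\
& JS \le 3
\end{aligned}
$$

The per-sector limits on the number of sub-stations and the limit of two wireless return links per sub-station apply in addition and are kept in words above.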

Cluster analysis is an important method in data mining. Its goal is to divide a data set into several clusters so that data points within the same cluster are as similar as possible, while data points in different clusters are as dissimilar as possible. The study of clustering algorithms has a long history: Hartigan systematically discussed them in his monograph Clustering Algorithms as early as 1975. Since then, researchers have proposed a variety of clustering algorithms based on different ideas, mainly partition-based, hierarchy-based, density-based, grid-based and model-based algorithms. All of these can achieve good clustering results; among them, the partition-based k-means algorithm is the most widely applied and has a simple underlying idea. By handling the difficult constraints separately, the k-means algorithm makes the problem relatively easy to solve. The algorithm converges well, and its solution speed meets real-time requirements.

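For a given set of 1000 sites, the samples are divided into K clusters according to the distance between sites. Suppose the clusters are $(B_1, B_2, \ldots, B_k)$; the objective is to minimize the squared error

$$E = \sum_{i=1}^{k} \sum_{x \in B_i} \lVert x - \mu_i \rVert_2^2$$

where $\mu_i$ is the mean vector (centroid) of cluster $B_i$: $\mu_i = \frac{1}{|B_i|} \sum_{x \in B_i} x$.

A suitable K value range of 10~25 was selected through cross validation. k samples are randomly selected from the data set as the initial k centroid vectors $\{\mu_1, \mu_2, \ldots, \mu_k\}$.

Take N as the maximum number of iterations. For n = 1, 2, 3, …, N:

Initialize the cluster partition B as $B_t = \varnothing$, t = 1, 2, …, k.

For i = 1, 2, …, m, calculate the distance between sample $x_i$ and each centroid vector $\mu_j$ (j = 1, 2, …, k): $d_{ij} = \lVert x_i - \mu_j \rVert_2^2$. Assign $x_i$ to the category $\lambda_i$ with the smallest $d_{ij}$ and update $B_{\lambda_i} = B_{\lambda_i} \cup \{x_i\}$.

For j = 1, 2, …, k, recalculate the new centroid over all sample points in $B_j$: $\mu_j = \frac{1}{|B_j|} \sum_{x \in B_j} x$.

If none of the k centroid vectors changes, go to the output step.

Output the cluster partition $B = \{B_1, B_2, \ldots, B_k\}$.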
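The iteration described above can be sketched in plain Python. This is a minimal illustration of standard k-means on planar site coordinates, not the paper's program; the function names, the fixed random seed and the rule of keeping the old centroid for an empty cluster are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Mean vector of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(samples, k, max_iterations=100, seed=0):
    """Return (clusters, centroids) for a list of coordinate tuples."""
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    rng = random.Random(seed)
    centroids = rng.sample(samples, k)  # k random samples as initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assignment step: each sample joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda j: squared_distance(x, centroids[j]))
            clusters[nearest].append(x)
        # Update step: recompute centroids; an empty cluster keeps its old one.
        new_centroids = [centroid(c) if c else centroids[j]
                         for j, c in enumerate(clusters)]
        if new_centroids == centroids:  # no centroid changed: converged
            break
        centroids = new_centroids
    return clusters, centroids


if __name__ == "__main__":
    sites = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    clusters, centroids = k_means(sites, 2)
    print(clusters)
```

In the two-level scheme described in this paper, the same routine would be applied first to all sites and then again inside each resulting region.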

The specific steps for solving the model are as follows:

Step 1: use the k-means algorithm to cluster the data and handle the difficult constraints by brute-force calculation, obtaining the upper and lower bounds of the original problem, that is, the solution that gives the lowest overall cost when the other constraints are not considered;

Step 2: solve the cost-priority model by applying the k-means algorithm again to cluster the sites within each sub-region and, taking the other constraints into account, obtain the number of host stations in each sub-region;

Step 3: if the obtained solution satisfies the constraints, stop the algorithm and sum the numbers over all cells; the result is the optimal solution of the problem. Otherwise, return to Step 2.

Step 3 yields the topology satisfying the constraint conditions. An example with k=20 and an adjustable cell radius of 20 km is shown in the figure below.

Base station topology

Step 4: apply the overall cost formula to the optimal solutions obtained in the sub-regions and output the minimum-cost result.
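The per-group check inside these steps (pick the butterfly station nearest the group centroid as host, then test the sub-station limit) can be sketched as follows. This is a simplified single-hop illustration under stated assumptions: the type labels, the function names and the fallback to any station when a group has no butterfly station are hypothetical, and multi-hop chains are not modeled.

```python
import math
from typing import NamedTuple


class Site(NamedTuple):
    x_km: float
    y_km: float
    station_type: str  # "butterfly" or "ruralstar" (hypothetical labels)


# Maximum number of sub-stations per host type, from the assumptions list.
MAX_SUBSTATIONS = {"butterfly": 12, "ruralstar": 4}


def choose_host(group):
    """Index of the butterfly station nearest the centroid of the group."""
    cx = sum(s.x_km for s in group) / len(group)
    cy = sum(s.y_km for s in group) / len(group)
    butterflies = [i for i, s in enumerate(group) if s.station_type == "butterfly"]
    # Assumption: fall back to all stations if the group has no butterfly station.
    candidates = butterflies or list(range(len(group)))
    return min(candidates,
               key=lambda i: math.hypot(group[i].x_km - cx, group[i].y_km - cy))


def group_is_feasible(group, host_index, radius_km=20.0):
    """True if the sub-station count and host distance limits both hold."""
    host = group[host_index]
    children = [s for i, s in enumerate(group) if i != host_index]
    if len(children) > MAX_SUBSTATIONS[host.station_type]:
        return False
    return all(math.hypot(s.x_km - host.x_km, s.y_km - host.y_km) <= radius_km
               for s in children)


if __name__ == "__main__":
    group = [Site(0, 0, "ruralstar"), Site(1, 1, "butterfly"),
             Site(2, 0, "ruralstar"), Site(1, 3, "ruralstar")]
    host = choose_host(group)
    print(host, group_is_feasible(group, host))
```

When a group fails the check, the procedure described in this paper changes the K value or the adjustable radius and clusters again.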

Step 4 gives the lowest-cost result. Taking k=20 as an example, the results are shown in figure 5-6: the number of host stations is 222, the number of sub-stations is 778 and the number of satellites is 28. The total cost is 7510 (WUSD) and the average cost is 7.5100 (WUSD) per site. The algorithm runs in less than 2 minutes and converges well.

Program run result

For comparison, radii of 20, 25 and 30 km were chosen, and the corresponding site distribution, station counts and total cost were obtained, as listed in the table below.

COST COMPARISON TABLE

Radius (km) | Host stations | Sub-stations | Satellites | Overall cost (WUSD) |
---|---|---|---|---|
20 | 222 | 778 | 28 | 7510 |
25 | 136 | 858 | 17 | 6500 |
30 | 110 | 871 | 14 | 6155 |

By comparison, the overall cost is lowest when the cell radius is 30 km: the minimum cost is 6155 (WUSD). The larger the radius, the more stations are left unconnected. However, even when the radius is 30 km, only 6 stations are left unconnected, which can be ignored.

According to the cost simulation, the connection relations and physical locations of the host stations and sub-stations are obtained. Only the path loss of the sub-station return links is taken into account; links between host stations only need to meet the distance limit. A smaller path loss can thus be achieved by increasing the number of host stations: when X is largest, PL is smallest, but the cost of satellite transmission also increases. Alternatively, the first-hop and second-hop distances can be restricted, that is, the distance between a sub-station and its host station and between sub-stations is reduced, to obtain a lower path loss. If there are only first-level sub-stations, the number of sub-stations in a single sector of the host station must be less than or equal to 4, and the modified sub-station model is then derived in reverse. Finally, the scheme with the lowest cost is selected as the optimal solution to the problem.

Although wireless return transmission is affected by non-line-of-sight (NLOS) conditions, the free-space transmission model is adopted to estimate the path loss between stations in order to simplify the problem. The formula is as follows:

$$PL = 32.45 + 20\log_{10} D + 20\log_{10} F$$

where PL is the path loss in dB, D is the distance between the two stations in km, and F is the transmission frequency in MHz; 900 MHz is adopted by default here.

The average system loss APL is equal to the sum of the losses of all wireless return links divided by the number of wireless return links.

It should be noted that the path loss only considers the return part of the sub-stations, and the microwave transmission is adopted between the host stations. The loss is not calculated as long as the distance limit is satisfied.

For comparison, first-hop/second-hop distance combinations of 20/10, 15/8 and 12/6 km were selected, and the corresponding numbers of sites and average loss were obtained, as listed in the table below.

NUMBER OF SITES AND AVERAGE LOSS

First hop/second hop (km) | Host stations | Sub-stations | Satellites | Average loss (dB) |
---|---|---|---|---|
20/10 | 222 | 778 | 28 | 111.79 |
15/8 | 385 | 610 | 49 | 109.44 |
12/6 | 588 | 410 | 74 | 107.30 |

Reducing the first-hop and second-hop distances reduces the average loss, so the 12/6 scheme is the best choice when only loss is considered. However, the number of sites not connected to the topology increases, the signal coverage decreases, and the cost is very high.
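To make the cost penalty concrete, the total cost formula given earlier can be applied to the three schemes. This worked calculation uses only the unit costs (10, 5 and 50 WUSD) and the station counts already reported in this paper.

$$
\begin{aligned}
20/10:\quad & C = 10 \cdot 222 + 5 \cdot 778 + 50 \cdot 28 = 7510 \\
15/8:\quad & C = 10 \cdot 385 + 5 \cdot 610 + 50 \cdot 49 = 9350 \\
12/6:\quad & C = 10 \cdot 588 + 5 \cdot 410 + 50 \cdot 74 = 11630
\end{aligned}
$$

Moving from the 20/10 scheme to the 12/6 scheme therefore lowers the average loss by about 4.5 dB while raising the total cost from 7510 to 11630 WUSD.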

To sum up, if only cost is considered, the best solution is an adjustable constraint radius of 30 km, which gives the lowest overall cost of 6155 (WUSD). If only loss is considered, the best scheme is k=30 with a first-hop distance of 12 km and a second-hop distance of 6 km.

In this paper, a local optimal model based on the k-means algorithm is constructed. By traversing and comparing the deployments of host stations and sub-stations under global large-scale clustering and local small-scale clustering, the lowest-cost sub-station scheme is selected. The model adopts the idea of “dividing the whole into parts” and applies the k-means algorithm. Owing to the scalability and efficiency of the algorithm, the 1000 candidate sites can be divided simply and quickly into K parts and solved step by step, which significantly reduces the amount of calculation and improves the running speed. However, the k value must be fixed in advance and is very difficult to estimate, so the optimal plan cannot be obtained in a single pass. In addition, because the sites are partitioned by clustering, the final solution of the model is always a local optimum, which may differ slightly from the global optimum. The initial value of k is therefore restricted in the modeling process: k is taken between 15 and 25, and the model is solved repeatedly for different k values so that a relatively optimal solution is obtained by comparison. After the lower-cost sub-station scheme is obtained, the return path loss of the sub-stations is considered, and the local optimal model is modified from the perspective of reducing the loss. Microwave connections are adopted between the host stations, and their loss is not calculated. The average loss of the system depends on the wireless return distance, so the algorithm efficiency is significantly improved by limiting the distance between stations. The limitation of this modified model is that the number of host stations, and hence the overall cost, increases. Although this benefits the service quality for users, it lengthens the investment payback period, which is not in line with the original intention of operators. To take the interests of operators into account, the scheme can subsequently be adjusted according to the situation.