SkyWalking Monitoring Integration

zhx1994 2019-04-29 11:02

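Download the latest 6.0 release for Windows:

http://skywalking.apache.org/downloads/

After unpacking the archive, edit the relevant configuration files in the config directory.

application.yml is configured as follows: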

cluster:
  standalone:
  # Please check your ZooKeeper is 3.5+, However, it is also compatible with ZooKeeper 3.4.x. Replace the ZooKeeper 3.5+
  # library the oap-libs folder with your ZooKeeper 3.4.x library.
#  zookeeper:
#    nameSpace: ${SW_NAMESPACE:""}
#    hostPort: ${SW_CLUSTER_ZK_HOST_PORT:localhost:2181}
#    #Retry Policy
#    baseSleepTimeMs: ${SW_CLUSTER_ZK_SLEEP_TIME:1000} # initial amount of time to wait between retries
#    maxRetries: ${SW_CLUSTER_ZK_MAX_RETRIES:3} # max number of times to retry
#  kubernetes:
#    watchTimeoutSeconds: ${SW_CLUSTER_K8S_WATCH_TIMEOUT:60}
#    namespace: ${SW_CLUSTER_K8S_NAMESPACE:default}
#    labelSelector: ${SW_CLUSTER_K8S_LABEL:app=collector,release=skywalking}
#    uidEnvName: ${SW_CLUSTER_K8S_UID:SKYWALKING_COLLECTOR_UID}
#  consul:
#    serviceName: ${SW_SERVICE_NAME:"SkyWalking_OAP_Cluster"}
#     Consul cluster nodes, example: 10.0.0.1:8500,10.0.0.2:8500,10.0.0.3:8500
#    hostPort: ${SW_CLUSTER_CONSUL_HOST_PORT:localhost:8500}
core:
  default:
    restHost: 127.0.0.1
    restPort: 12800
    restContextPath: /
    gRPCHost: 127.0.0.1
    gRPCPort: 11800
    downsampling:
    - Hour
    - Day
    - Month
    # Set a timeout on metric data. After the timeout has expired, the metric data will automatically be deleted.
    recordDataTTL: ${SW_CORE_RECORD_DATA_TTL:90} # Unit is minute
    minuteMetricsDataTTL: ${SW_CORE_MINUTE_METRIC_DATA_TTL:90} # Unit is minute
    hourMetricsDataTTL: ${SW_CORE_HOUR_METRIC_DATA_TTL:36} # Unit is hour
    dayMetricsDataTTL: ${SW_CORE_DAY_METRIC_DATA_TTL:45} # Unit is day
    monthMetricsDataTTL: ${SW_CORE_MONTH_METRIC_DATA_TTL:18} # Unit is month
storage:
# h2:
#   driver: ${SW_STORAGE_H2_DRIVER:org.h2.jdbcx.JdbcDataSource}
#   url: ${SW_STORAGE_H2_URL:jdbc:h2:mem:skywalking-oap-db}
#   user: ${SW_STORAGE_H2_USER:sa}
#  elasticsearch:
#    # nameSpace: ${SW_NAMESPACE:""}
#    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}
#    indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:2}
#    indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:0}
#    # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
#    bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:2000} # Execute the bulk every 2000 requests
#    bulkSize: ${SW_STORAGE_ES_BULK_SIZE:20} # flush the bulk every 20mb
#    flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:10} # flush the bulk every 10 seconds whatever the number of requests
#    concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
  mysql:
receiver-register:
  default:
receiver-trace:
  default:
    bufferPath: ${SW_RECEIVER_BUFFER_PATH:../trace-buffer/}  # Path to trace buffer files, suggest to use absolute path
    bufferOffsetMaxFileSize: ${SW_RECEIVER_BUFFER_OFFSET_MAX_FILE_SIZE:100} # Unit is MB
    bufferDataMaxFileSize: ${SW_RECEIVER_BUFFER_DATA_MAX_FILE_SIZE:500} # Unit is MB
    bufferFileCleanWhenRestart: ${SW_RECEIVER_BUFFER_FILE_CLEAN_WHEN_RESTART:false}
    sampleRate: ${SW_TRACE_SAMPLE_RATE:10000} # The sample rate precision is 1/10000. 10000 means 100% sample in default.
receiver-jvm:
  default:
#service-mesh:
#  default:
#    bufferPath: ${SW_SERVICE_MESH_BUFFER_PATH:../mesh-buffer/}  # Path to trace buffer files, suggest to use absolute path
#    bufferOffsetMaxFileSize: ${SW_SERVICE_MESH_OFFSET_MAX_FILE_SIZE:100} # Unit is MB
#    bufferDataMaxFileSize: ${SW_SERVICE_MESH_BUFFER_DATA_MAX_FILE_SIZE:500} # Unit is MB
#    bufferFileCleanWhenRestart: ${SW_SERVICE_MESH_BUFFER_FILE_CLEAN_WHEN_RESTART:false}
#istio-telemetry:
#  default:
#receiver_zipkin:
#  default:
#    host: ${SW_RECEIVER_ZIPKIN_HOST:0.0.0.0}
#    port: ${SW_RECEIVER_ZIPKIN_PORT:9411}
#    contextPath: ${SW_RECEIVER_ZIPKIN_CONTEXT_PATH:/}
query:
  graphql:
    path: ${SW_QUERY_GRAPHQL_PATH:/graphql}
alarm:
  default:
telemetry:
  none:
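
Many values in this file use the form ${NAME:default}. The sketch below illustrates how such a placeholder can be read, assuming the semantics are "use the environment variable NAME if it is set, otherwise the text after the first colon". It also converts a sampleRate value into a fraction, following the comment in the file that the precision is 1/10000. The function names resolve and sample_fraction are made up for this illustration and are not part of SkyWalking.

```python
import re

# Matches ${NAME:default}. The default may itself contain colons,
# for example localhost:2181, so only the first colon is the separator.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z0-9_]+):([^}]*)\}")


def resolve(value, env):
    """Replace each ${NAME:default} with env[NAME] if present, else the default."""
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(2)), value)


def sample_fraction(sample_rate):
    """Convert a sampleRate setting (precision 1/10000) into a fraction of traces."""
    return int(sample_rate) / 10000


# With no environment variables set, the defaults from the file apply.
print(resolve("${SW_CORE_RECORD_DATA_TTL:90}", {}))
print(resolve("${SW_CLUSTER_ZK_HOST_PORT:localhost:2181}", {}))

# An environment variable, when present, takes precedence over the default.
env = {"SW_TRACE_SAMPLE_RATE": "5000"}
rate = resolve("${SW_TRACE_SAMPLE_RATE:10000}", env)
print(rate, sample_fraction(rate))
```

To apply this to the real environment, pass os.environ as the env argument.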


datasource-settings.properties is configured as follows:

jdbcUrl=jdbc:mysql://192.168.1.100:3306/swtest
dataSource.user=root
dataSource.password=MyNewPass4!
dataSource.cachePrepStmts=true
dataSource.prepStmtCacheSize=250
dataSource.prepStmtCacheSqlLimit=2048
dataSource.useServerPrepStmts=true
dataSource.useLocalSessionState=true
dataSource.rewriteBatchedStatements=true
dataSource.cacheResultSetMetadata=true
dataSource.cacheServerConfiguration=true
dataSource.elideSetAutoCommits=true
dataSource.maintainTimeStats=false
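
Before starting the server it can help to double-check which MySQL host, port, and database this file actually points at. The following is a minimal sketch, not part of SkyWalking: parse_properties and mysql_target are hypothetical helper names, and the parser only handles plain key=value lines like the ones shown above.

```python
from urllib.parse import urlparse


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def mysql_target(jdbc_url):
    """Return (host, port, database) from a jdbc:mysql://host:port/db URL."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a JDBC URL: " + jdbc_url)
    parsed = urlparse(jdbc_url[len("jdbc:"):])
    # Fall back to 3306, the usual MySQL port, when the URL omits it.
    return parsed.hostname, parsed.port or 3306, parsed.path.lstrip("/")


sample = """
jdbcUrl=jdbc:mysql://192.168.1.100:3306/swtest
dataSource.user=root
"""
props = parse_properties(sample)
print(mysql_target(props["jdbcUrl"]))
```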


Run startup.bat in the bin directory.

Check the two log files in the log directory:

skywalking-oap-server.log

webapp.log

If neither log shows errors, open http://localhost:8080 to view the web UI.
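
A quick way to confirm that the server came up is to test whether its ports accept TCP connections. The sketch below assumes the ports used in this article: 11800 (gRPCPort) and 12800 (restPort) from application.yml, and 8080 for the web UI. The helper name port_open is chosen for this illustration only.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports taken from the configuration in this article.
for name, port in [("gRPC", 11800), ("REST", 12800), ("web UI", 8080)]:
    state = "open" if port_open("127.0.0.1", port) else "closed"
    print(f"{name} port {port}: {state}")
```

An open port only shows that something is listening. The two log files remain the place to look for startup errors.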

Make three copies of the agent directory and rename each copy.

Edit the agent.config file of each of the three agents:

agent.service_name=gz-auth

collector.backend_service=127.0.0.1:11800

logging.level=debug


agent.service_name=gz-gate

collector.backend_service=127.0.0.1:11800

logging.level=debug


agent.service_name=gz-admin

collector.backend_service=127.0.0.1:11800

logging.level=debug
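
The three files differ only in agent.service_name, so the settings can be generated from a list of service names. This is a small sketch under that assumption. agent_config is a hypothetical helper, and it renders only the three keys shown above rather than a complete agent.config.

```python
def agent_config(service_name, backend="127.0.0.1:11800", level="debug"):
    """Render the three agent.config settings used for one service."""
    return (
        f"agent.service_name={service_name}\n"
        f"collector.backend_service={backend}\n"
        f"logging.level={level}\n"
    )


# One block of settings per agent copy.
for name in ["gz-auth", "gz-gate", "gz-admin"]:
    print(agent_config(name))
```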


Start each application with its own agent by adding the agent to the JVM startup arguments.

Reference configuration:

-javaagent:D:\Workspace\Others\hello-spring-cloud-alibaba\hello-spring-cloud-external-skywalking\agent\skywalking-agent.jar 

-Dskywalking.agent.service_name=nacos-provider -
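
For applications launched from the command line instead of an IDE, the same arguments can be assembled programmatically. The sketch below is illustrative only: java_command is a hypothetical helper, and the agent path and the application jar name are placeholders rather than values from this article. JVM options such as -javaagent have to appear before -jar.

```python
def java_command(agent_jar, service_name, app_jar):
    """Build the argument list for starting one application with the agent attached."""
    return [
        "java",
        f"-javaagent:{agent_jar}",
        f"-Dskywalking.agent.service_name={service_name}",
        "-jar",
        app_jar,
    ]


# Placeholder paths, chosen only for this example.
cmd = java_command(r"D:\agents\gz-auth\skywalking-agent.jar", "gz-auth", "gz-auth.jar")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run to start the process.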

Check the logs under the agent directory to confirm that the agent started successfully.


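View the calls among the three applications in the web UI.

SkyWalking Trace Monitoring

SkyWalking performs dependency analysis by monitoring business calls. It gives us the call topology between services as well as the Trace records for each Endpoint.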

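Call Chain Monitoring

Click the Trace menu to open the trace page.

Click a Trace ID to expand its details.

The example trace is a normal response with a total response time of 185 ms and a single Span (the basic unit of work; here it represents one complete request together with its response).

The Span /echo/{message} is described as follows:

  • Duration: response time, 185 ms

  • component: the component type, SpringMVC

  • url: the request URL

  • http.method: the request method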
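
To make the relationship between a Trace and its Spans concrete, here is a minimal data-structure sketch. It is not SkyWalking's internal model: the class and field names are assumptions, and the url and http.method values are placeholders, since the article does not give them. Only the endpoint, the 185 ms duration, and the SpringMVC component come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    endpoint: str
    start_ms: int
    end_ms: int
    tags: dict = field(default_factory=dict)

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    trace_id: str
    spans: list

    def duration_ms(self):
        # From the earliest span start to the latest span end.
        return max(s.end_ms for s in self.spans) - min(s.start_ms for s in self.spans)


span = Span(
    endpoint="/echo/{message}",
    start_ms=0,
    end_ms=185,
    tags={
        "component": "SpringMVC",
        "url": "http://localhost/echo/hello",  # placeholder value
        "http.method": "GET",                  # placeholder value
    },
)
trace = Trace(trace_id="example-trace", spans=[span])
print(trace.duration_ms(), span.tags["component"])
```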

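Service Performance Metrics Monitoring

Click the Service menu to open the service performance metrics page.

Select the service you want to monitor.

  • Avg SLA: service availability (calculated mainly from the number of successful and failed requests)

  • CPM: calls per minute

  • Avg Response Time: average response time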
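
The three figures can be illustrated with a small worked calculation. The formulas below are assumptions based on the descriptions above (SLA as the share of successful requests, CPM as calls divided by minutes), not SkyWalking's exact implementation, and the input numbers are invented for the example.

```python
def service_metrics(latencies_ms, failures, minutes):
    """Compute (SLA in percent, CPM, average response time in ms).

    latencies_ms: response time of every request in the window
    failures:     how many of those requests failed
    minutes:      length of the window in minutes
    """
    total = len(latencies_ms)
    if total == 0 or minutes <= 0:
        raise ValueError("need at least one request and a positive window")
    sla = 100.0 * (total - failures) / total
    cpm = total / minutes
    avg = sum(latencies_ms) / total
    return sla, cpm, avg


# Four requests in a two-minute window, one of which failed.
sla, cpm, avg = service_metrics([185, 120, 95, 200], failures=1, minutes=2)
print(f"SLA {sla}%  CPM {cpm}  Avg {avg} ms")
```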

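Click More Server Details... to see more detailed information.

The details view shows data for the service over a given time range, including:

  • Service availability (SLA)

  • Average number of responses per minute

  • Average response time

  • PID of the service process

  • IP, Host, and OS of the physical machine running the service

  • Runtime CPU usage

  • Runtime heap memory usage

  • Runtime non-heap memory usage

  • GC activity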
