插件集成
Prometheus
Prometheus是一个开源的监控和警报工具集,目标是提供可靠的实时监控和警报,专为动态的云环境和微服务架构设计。
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/prometheus.yaml
在istio中,通过开启指标合并
将prometheus.io
注解添加到sidecar中来设置收集指标(若已存在会被覆盖),并通过svc:15020/stats/prometheus
暴露给prometheus收集。
配置istio开启指标合并
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: istio
namespace: istio-system
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
defaultProviders:
metrics:
- prometheus
enablePrometheusMerge: true
rootNamespace: istio-system
trustDomain: cluster.local
EOF
prometheus通过注解对服务指标进行抓取
apiVersion: v1
kind: Pod
metadata:
annotations:
istio.io/rev: default
kubectl.kubernetes.io/default-container: tcp-echo
kubectl.kubernetes.io/default-logs-container: tcp-echo
prometheus.io/path: /stats/prometheus
prometheus.io/port: "15020"
prometheus.io/scrape: "true"# 是否要抓取该服务的数据
sidecar.istio.io/status: '{"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["workload-socket","credential-socket","workload-certs","istio-envoy","istio-data","istio-podinfo","istio-token","istiod-ca-cert"],"imagePullSecrets":null,"revision":"default"}'
creationTimestamp: "2024-01-29T01:41:57Z"
generateName: tcp-echo-54b7ccdc9-
labels:
app: tcp-echo
pod-template-hash: 54b7ccdc9
security.istio.io/tlsMode: istio
service.istio.io/canonical-name: tcp-echo
service.istio.io/canonical-revision: v1
version: v1
name: tcp-echo-54b7ccdc9-m4g6t
namespace: test
修改svc,基于NodePort对外暴露
$ kubectl -n istio-system patch svc prometheus --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30009}]'
浏览器访问IP:30009查看prometheus服务能否正常响应
Grafana
Grafana是一个开源的数据可视化和监控平台,用于创建和共享动态、交互式的仪表板。
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/grafana.yaml
修改svc,基于NodePort对外暴露
$ kubectl -n istio-system patch svc grafana --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30000}]'
浏览器访问IP:30000查看grafana服务能否正常响应
OpenTelemetry
OpenTelemetry是一个可观测性框架和工具包,旨在创建和管理遥测数据,例如跟踪、指标和日志,它可以与jaeger、prometheus等可观测性工具一起使用。其中OpenTelemetry Collector(otel)用于接收、处理和导出遥测数据,并持久化到指定组件。
部署otel-collector组件用来收集envoy的访问日志
$ kubectl -n istio-system apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/open-telemetry/otel.yaml
修改istio配置,用来支持otel
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: istio
namespace: istio-system
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
defaultProviders:
metrics:
- prometheus
enablePrometheusMerge: true
accessLogFile: /dev/stdout
extensionProviders:
- name: otel
envoyOtelAls:
service: opentelemetry-collector.istio-system
port: 4317
rootNamespace: istio-system
trustDomain: cluster.local
EOF
Grafana-Loki
一个由Grafana Labs开发的开源日志聚合系统,旨在为云原生架构提供高效的日志处理解决方案。作用与ELK中的logstash
类似,用来存储日志,基于标签
提供日志查询的接口和数据过滤等功能。
安装loki组件,并修改otel-collector相关配置
$ kubectl -n istio-system apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/loki.yaml
$ kubectl -n istio-system apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/open-telemetry/loki/otel.yaml
修改istio的配置
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: istio
namespace: istio-system
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
defaultProviders:
metrics:
- prometheus
enablePrometheusMerge: true
accessLogFile: /dev/stdout
extensionProviders:
- name: otel
envoyOtelAls:
service: opentelemetry-collector.istio-system
port: 4317
logFormat:
text: "[%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %UPSTREAM_LOCAL_ADDRESS%"
labels:
pod: "%ENVIRONMENT(POD_NAME)%"
namespace: "%ENVIRONMENT(POD_NAMESPACE)%"
cluster: "%ENVIRONMENT(ISTIO_META_CLUSTER_ID)%"
mesh: "%ENVIRONMENT(ISTIO_META_MESH_ID)%"
rootNamespace: istio-system
trustDomain: cluster.local
EOF
logFormat.labels的作用就是给输出到loki的日志
添加标签
,在grafana的界面可以通过标签检索到对应的日志。
修改grafana和loki的svc为NodePort并对外暴露30000和30010端口
$ kubectl -n istio-system patch svc grafana --type='json' -p='[
{"op":"replace","path":"/spec/type","value":"NodePort"},
{"op":"replace","path":"/spec/ports/0/nodePort","value":30000}
]'
$ kubectl -n istio-system patch svc loki --type='json' -p='[
{"op":"replace","path":"/spec/type","value":"NodePort"},
{"op":"replace","path":"/spec/ports/0/nodePort","value":30010}
]'
Zipkin
信息
Zipkin是一个分布式追踪系统。它有助于收集排除服务体系结构中的延迟问题所需的定时数据。功能包括数据收集和查找。
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/extras/zipkin.yaml
因为使用了kind部署测试环境,之前开放了docker的30000-30020端口,所以这里需要修改zipkin的NodePort端口为30014
$ kubectl -n istio-system patch svc zipkin --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30014}]'
zipkin追踪数据采集支持2种方式,分别通过defaultConfig
和extensionProviders
方式。
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
namespace: istio-system
name: istio
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
tracing:
zipkin:
address: zipkin.istio-system:9411
sampling: 100
defaultProviders:
metrics:
- prometheus
enablePrometheusMerge: true
enableTracing: true
accessLogFile: /dev/stdout
rootNamespace: istio-system
trustDomain: cluster.local
EOF
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
namespace: istio-system
name: istio
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
tracing: {} # 禁用默认的tracing配置
defaultProviders:
metrics:
- prometheus
enablePrometheusMerge: true
enableTracing: true
accessLogFile: /dev/stdout
extensionProviders:
- name: zipkin
zipkin:
service: zipkin.istio-system.svc.cluster.local
port: 9411
maxTagLength: 256 # span标签长度(默认256)
rootNamespace: istio-system
trustDomain: cluster.local
---
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
name: mesh-default
namespace: istio-system
spec:
tracing:
- providers:
- name: "zipkin"
randomSamplingPercentage: 100.0 # 采样率
EOF
- 需要通过设置
tracing: {}
来禁用全局tracing的默认配置:zipkin:9411
。- 全局默认方式配置需要
重启对应的服务
,插件则不需要重启,但需要额外配置Telemetry
。
访问节点IP:30014
来查看zipkin页面
Jaeger
信息
Jaeger是一个开源的、端到端的分布式追踪系统,用于监视和调试复杂的微服务架构。
$ kubectl delete -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/extras/zipkin.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/jaeger.yaml
修改jaeger svc类型为NodePort
$ kubectl -n istio-system patch svc tracing --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30014}]'
因为jaeger和zipkin存在相同的service名称,先删除zipkin后再安装jaeger
和zipkin一样,jaeger也支持全局和插件的方式修改配置,将追踪数据推送到jaeger。
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: istio
namespace: istio-system
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
tracing:
zipkin:
address: jaeger-collector.istio-system:9411
sampling: 100.0
defaultProviders:
metrics:
- prometheus
enablePrometheusMerge: true
enableTracing: true
accessLogFile: /dev/stdout
rootNamespace: istio-system
trustDomain: cluster.local
EOF
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
namespace: istio-system
name: istio
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
tracing: {} # 禁用默认的tracing配置
defaultProviders:
metrics:
- prometheus
enablePrometheusMerge: true
enableTracing: true
accessLogFile: /dev/stdout
extensionProviders:
- name: zipkin
zipkin:
service: zipkin.istio-system.svc.cluster.local
port: 9411
maxTagLength: 256 # span标签长度(默认256)
rootNamespace: istio-system
trustDomain: cluster.local
---
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
name: mesh-default
namespace: istio-system
spec:
tracing:
- providers:
- name: "zipkin"
randomSamplingPercentage: 100.0 # 采样率
EOF
- jaeger通过全局配置方式修改后需要重启对应的服务。
- jaeger实现了zipkin的api,并对外暴露了名为zipkin的service,所以地址仍是
zipkin.istio-system
.
访问节点IP:30014
来查看jaeger页面
Kiali
信息
kiali是一个用于在微服务架构中提供监控和可观测性的开源工具。通过提供可视化的监控和分析工具,Kiali使用户能够快速诊断问题
、优化性能
,并确保微服务系统的可靠运行。
# depend on prometheus & kiali & grafana & jaeger
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/prometheus.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/grafana.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/jaeger.yaml
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/addons/kiali.yaml
kiali的分布式追踪依靠jaeger来实现
# 修改kiali配置
$ kubectl -n istio-system edit cm kiali -o yaml
# 修改如下配置
external_services:
grafana:
enabled: true
url: 'http://grafana.istio-system:3000/'
tracing:
enable: true
use_grpc: false
in_cluster_url: "http://tracing.istio-system/"
prometheus:
enable: true
url: "http://prometheus.istio-system:9090/"
修改kiali svc类型为NodePort
$ kubectl -n istio-system patch svc kiali --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30017}]'
访问IP:30017
查看kiali页面
Skywalking
Apache SkyWalking 是一个专门设计用于微服务、 云原生和容器等架构的应用性能监控 (APM) 系统。SkyWalking是可观测性的一站式解决方案, 不仅具有像Jaeger和Zipkin的分布式追踪能力,像Prometheus和Grafana的指标能力,像Kiali的日志记录能力,还能将可观测性扩展到许多其他场景, 例如将日志与链路关联,收集系统事件并将事件与指标关联,基于eBPF的服务性能分析等。
安装skywalking
$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/extras/skywalking.yaml
修改istio配置,向skywalking发送链路追踪数据
$ kubectl -n istio-system apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: istio
namespace: istio-system
data:
mesh: |-
defaultConfig:
discoveryAddress: istiod.istio-system.svc:15012
envoyAccessLogService:
address: skywalking-oap.istio-system:11800
defaultProviders:
metrics:
- prometheus
tracing:
- skywalking
enablePrometheusMerge: true
enableEnvoyAccessLogService: true
enableTracing: true
accessLogFile: /dev/stdout
extensionProviders:
- name: skywalking
skywalking:
service: tracing.istio-system
port: 11800
- name: otel
envoyOtelAls:
service: opentelemetry-collector.istio-system
port: 4317
rootNamespace: istio-system
trustDomain: cluster.local
EOF
开启链路追踪功能,并指定采样率
$ kubectl apply -f - <<EOF
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
name: mesh-default
namespace: istio-system
spec:
tracing:
- providers:
- name: "skywalking"
randomSamplingPercentage: 100.0
EOF
修改skywalking的svc为NodePort
$ kubectl -n istio-system patch svc skywalking-ui --type='json' -p='[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30008}]'
访问IP:30008查看skywalking的web页面
分布式追踪系统对比
类别 | Apache SkyWalking | Jaeger | Zipkin |
---|---|---|---|
实现方式 | 基于语言的探针、服务网格探针、eBPF agent、第三方指标库(当前支持 Zipkin) | 基于语言的探针 | 基于语言的探针 |
数据存储 | ES、H2、MySQL、TiDB、Sharding-sphere、BanyanDB | ES、MySQL、Cassandra、内存 | ES、MySQL、Cassandra、内存 |
支持语言 | Java、Rust、PHP、NodeJS、Go、Python、C++、.NET、Lua | Java、Go、Python、NodeJS、C#、PHP、Ruby、C++ | Java、Go、Python、NodeJS、C#、PHP、Ruby、C++ |
发起者 | 个人 | Uber | |
治理方式 | Apache Foundation | CNCF | CNCF |
版本 | 9.3.0 | 1.39.0 | 2.23.19 |